METHODS OF DIAGNOSIS OF LUNG CANCER, COMPOSITIONS AND METHODS 
OF SCREENING FOR MODULATORS OF LUNG CANCER 



CROSS-REFERENCES TO RELATED APPLICATIONS 

This application is related to USSN 60/284,770, filed April 18, 2001; USSN 
60/290,492, filed May 10, 2001; USSN 60/334,370, filed November 29, 2001; USSN 
60/339,245, filed November 9, 2001; USSN 60/350,666, filed November 13, 2001; and 
USSN 60/xxx,xxx, filed April 12, 2002 (Docket OMNI-002P); each of which is incorporated 
herein by reference in its entirety. 

FIELD OF THE INVENTION 
The invention relates to the identification of nucleic acid and protein expression 
profiles and nucleic acids, products, and antibodies thereto that are involved in lung cancer; 
and to the use of such expression profiles and compositions in diagnosis and therapy of lung 
cancer. The invention further relates to methods for identifying and using agents and/or 
targets that inhibit lung cancer or related conditions. 

BACKGROUND OF THE INVENTION 

Lung cancer is the second most commonly occurring cancer in the United States and 
is the leading cause of cancer-related death. It is estimated that there are over 160,000 new 
cases of lung cancer in the United States every year. Of those who are diagnosed with lung 
cancer, 86 percent will die within five years. Lung cancer is the most common visceral 
cancer in men and accounts for nearly one third of all cancer deaths in both men and women. 
In fact, lung cancer accounts for 7% of all deaths, due to any cause, in both men and women. 

Smoking is the primary cause of lung cancer, with more than 80% of lung cancers 
resulting from smoking. About 400 to 500 separate gaseous substances are present in the 
smoke of a non-filter cigarette. The most noteworthy substances include nitrogen oxides, 
hydrogen cyanide, formaldehyde, benzene, and toluene. The particles present in cigarette 
smoke contain at least 3,500 individual compounds such as nicotine, tobacco alkaloids 
(nornicotine, anatabine, anabasine), polycyclic aromatic hydrocarbons (e.g., benzo(a)pyrene, 
B(a)P), naphthalenes, aromatic amines, phenols, and tobacco-specific nitrosamines. 
Tobacco-specific nitrosamines are formed during tobacco curing and processing, and 
are suspected of causing lung cancer in humans. In rodent studies, regardless of where or 
how it is applied, the tobacco-specific nitrosamine known as NNK produces lung adenomas 
and lung adenocarcinomas. The tobacco-specific nitrosamine known as NNAL also produces 
lung adenocarcinomas in rodents. 

Many of the chemicals found in cigarette smoke also affect the nonsmoker inhaling 
"secondhand" or sidestream smoke. Indeed, the smoke inhaled by non-smokers has a 
chemical composition similar to the smoke inhaled by smokers, but, importantly, the 
carcinogenic tobacco-specific nitrosamines are present in higher concentrations in 
secondhand smoke. For this and other reasons, "passive smoking" is an 
important cause of lung cancer, causing as many as 3,000 lung cancer deaths in nonsmokers 
each year. 

In addition to smoking, other factors thought to be causes of lung cancer include 
on-the-job exposure to carcinogens such as asbestos and uranium, exposure to chemical hazards 
such as radon, polycyclic aromatic hydrocarbons, chromium, nickel, and inorganic arsenic, 
genetic factors, and diet. 

Histological classification of various lung cancers defines the types of cancer that 
begin in the lung. See, e.g., Travis, et al. (1999) Histological Typing of Lung and Pleural 
Tumours (International Histological Classification of Tumours, No. 1). Four major cell types 
make up more than 88% of all primary lung neoplasms. These are: squamous or epidermoid 
carcinoma, small cell (also called oat cell) carcinoma, adenocarcinoma, and large cell (also 
called large cell anaplastic) carcinoma. The remainder include undifferentiated carcinomas, 
carcinoids, bronchial gland tumors, and other rarer types. The various cell types have 
different natural histories and responses to therapy, and, thus, a correct histologic diagnosis is 
the first step of effective treatment. 

Small cell lung cancer (SCLC) accounts for 18-25% of all lung cancers. It occurs 
less frequently than non-small cell lung cancer, and generally spreads to distant organs more 
rapidly than non-small cell lung cancer. In general, at the time of presentation small cell lung 
cancers have already spread beyond the bounds where surgery and curative intent 
can be undertaken. However, if identified early enough, these cancers are often responsive to 
chemotherapy and thoracic radiation treatment. 

Non-small cell lung cancers (NSCLC) are the more frequently occurring form of lung 
cancer. They comprise squamous cell carcinoma, adenocarcinoma, and large cell carcinoma 
and account for more than 75% of all lung cancers. Non-small cell tumors that are localized 
at the time of presentation can sometimes be cured with surgery and/or radiotherapy, but 
usually are not identified until significant metastasis has occurred, at which point they are 
typically not very responsive to surgical, chemotherapy, or radiation treatment. 

The screening of asymptomatic persons at high risk for lung cancer has often proven 
ineffective. In general, only 5 to 15 percent of lung cancer patients have their disease 
detected while they are asymptomatic. Of course, early detection and treatment are critical 
factors in the fight against lung cancer. The average survival rate is 49% for those whose 
cancer is detected early, before the cancer has spread from the lung. Lung cancer often 
spreads outside of the lung, and it may have spread to the bones or brain by the time it is 
diagnosed. While the prognosis may be better for lung cancers that are detected early, 
because of the lack of effective curative treatments, early detection does not necessarily alter 
the total death rate from lung cancer. 

Thus, methods for diagnosis and prognosis of lung cancer and effective treatment of 
lung cancer would be desirable. Accordingly, provided herein are methods that can be used 
in diagnosis and prognosis of lung cancer. Further provided are methods that can be used to 
screen candidate therapeutic agents for the ability to modulate, e.g., treat, lung cancer. 
Additionally, provided herein are molecular targets and compositions for therapeutic 
intervention in lung disease and other metastatic cancers. 


SUMMARY OF THE INVENTION 
The present invention provides nucleotide sequences of genes that are up- and 
down-regulated in lung cancer cells. Such genes are useful for diagnostic purposes, and also as 
targets for screening for therapeutic compounds that modulate lung cancer, such as 
antibodies. The methods of detecting nucleic acids of the invention or their encoded proteins 
can be used for a number of purposes. Examples include early detection of lung cancers, 
monitoring and early detection of relapse following treatment of lung cancers, monitoring 
response to therapy of lung cancers, determining prognosis of lung cancers, directing therapy 
of lung cancers, selecting patients for postoperative chemotherapy or radiation therapy, 
selecting therapy, determining tumor prognosis, treatment, or response to treatment, and early 
detection of precancerous lesions of the lung. Examples of benign or precancerous lesions 
include: atelectasis, emphysema, bronchitis, chronic obstructive pulmonary disease, fibrosis, 
hypersensitivity pneumonitis (HP), interstitial pulmonary fibrosis (IPF), asthma, and 
bronchiectasis. Other aspects of the invention will become apparent to the skilled artisan by 
the following description of the invention. 

In one aspect, the present invention provides a method of detecting a lung cancer- 
associated transcript in a cell from a patient, the method comprising contacting a biological 
sample from the patient with a polynucleotide that selectively hybridizes to a sequence at 
least 80% identical to a sequence as shown in Tables 1A-16. Alternatively, the sample may 
be contacted with a specific binding reagent, e.g., antibody. 

In one embodiment, the polynucleotide selectively hybridizes to a sequence at least 
95% identical to a sequence as shown in Tables 1A-16. In another embodiment, the 
polynucleotide comprises a sequence as shown in Tables 1A-16. 

In one embodiment, the biological sample is a tissue sample, or a body fluid. In 
another embodiment, the biological sample comprises isolated nucleic acids, e.g., mRNA. 

In one embodiment, the polynucleotide is labeled, e.g., with a fluorescent label. In 
one embodiment, the polynucleotide is immobilized on a solid surface. In one embodiment, 
the patient is undergoing a therapeutic regimen to treat lung cancer. In another embodiment, 
the patient is suspected of having lung cancer. In one embodiment, the patient is a primate, 
e.g., a human. 

In one embodiment, the method further comprises the step of amplifying nucleic acids 
before the step of contacting the biological sample with the polynucleotide. 

In another aspect, the present invention provides a method of monitoring the efficacy 
of a therapeutic treatment of lung cancer, the method comprising the steps of: (i) providing a 
biological sample from a patient undergoing the therapeutic treatment; and (ii) determining 
the level of a lung cancer-associated transcript in the biological sample by contacting the 
biological sample with a polynucleotide that selectively hybridizes to a sequence at least 80% 
identical to a sequence as shown in Tables 1A-16, thereby monitoring the efficacy of the 
therapy. Alternatively, the sample may be evaluated for protein, e.g., by contacting the sample 
with an antibody. 

In one embodiment, the method further comprises the step of: (iii) comparing the 
level of the lung cancer-associated transcript to a level of the lung cancer-associated 
transcript in a biological sample from the patient prior to, or earlier in, the therapeutic 
treatment. Alternatively, the sample may be evaluated for comparison of protein. 

In another aspect, the present invention provides a method of monitoring the efficacy 
of a therapeutic treatment of lung cancer, the method comprising the steps of: (i) providing a 
biological sample from a patient undergoing the therapeutic treatment; and (ii) determining 
the level of a lung cancer-associated antibody in the biological sample by contacting the 
biological sample with a polypeptide encoded by a polynucleotide that selectively hybridizes 
to a sequence at least 80% identical to a sequence as shown in Tables 1A-16, wherein the 
polypeptide specifically binds to the lung cancer-associated antibody, thereby monitoring the 
efficacy of the therapy. 

In one embodiment, the method further comprises the step of: (iii) comparing the 
level of the lung cancer-associated antibody to a level of the lung cancer-associated antibody 
in a biological sample from the patient prior to, or earlier in, the therapeutic treatment. 

In another aspect, the present invention provides a method of monitoring the efficacy 
of a therapeutic treatment of lung cancer, the method comprising the steps of: (i) providing a 
biological sample from a patient undergoing the therapeutic treatment; and (ii) determining 
the level of a lung cancer-associated polypeptide in the biological sample by contacting the 
biological sample with an antibody, wherein the antibody specifically binds to a polypeptide 
encoded by a polynucleotide that selectively hybridizes to a sequence at least 80% identical 
to a sequence as shown in Tables 1A-16, thereby monitoring the efficacy of the therapy. 

In one embodiment, the method further comprises the step of: (iii) comparing the 
level of the lung cancer-associated polypeptide to a level of the lung cancer-associated 
polypeptide in a biological sample from the patient prior to, or earlier in, the therapeutic 
treatment. 

In one aspect, the present invention provides an isolated nucleic acid molecule 
consisting of a polynucleotide sequence as shown in Tables 1A-16. In one embodiment, an 
expression vector or cell comprises the isolated nucleic acid. In one aspect, the present 
invention provides an isolated polypeptide which is encoded by a nucleic acid molecule 
having a polynucleotide sequence as shown in Tables 1A-16. 

In another aspect, the present invention provides an antibody that specifically binds to 
an isolated polypeptide which is encoded by a nucleic acid molecule having a polynucleotide 
sequence as shown in Tables 1A-16. In one embodiment, the antibody is conjugated to an 
effector component, e.g., a fluorescent label, a radioisotope or a cytotoxic chemical. In one 
embodiment, the antibody is an antibody fragment. In another embodiment, the antibody is 
humanized. 

In one aspect, the present invention provides a method of detecting lung cancer in a 
patient, the method comprising contacting a biological sample from the patient with an 
antibody or protein as described herein. 

In another aspect, the present invention provides a method of detecting antibodies 
specific to a lung cancer gene in a patient, the method comprising contacting a biological 
sample from the patient with a polypeptide encoded by a nucleic acid that comprises a sequence 
from Tables 1A-16. 

In another aspect, the present invention provides a method for identifying a compound 
that modulates a lung cancer-associated polypeptide, the method comprising the steps of: (i) 
contacting the compound with a lung cancer-associated polypeptide, the polypeptide encoded 
by a polynucleotide that selectively hybridizes to a sequence at least 80% identical to a 
sequence as shown in Tables 1A-16; and (ii) determining the functional effect of the 
compound upon the polypeptide. 

In one embodiment, the functional effect is a physical effect, an enzymatic effect, or a 
chemical effect. In one embodiment, the polypeptide is expressed in a eukaryotic host cell or 
cell membrane. In another embodiment, the polypeptide is recombinant. In one 
embodiment, the functional effect is determined by measuring ligand binding to the 
polypeptide. 

In another aspect, the present invention provides a method of inhibiting proliferation 
or another critical process of a lung cancer-associated cell to treat lung cancer in a patient, the 
method comprising the step of administering to the subject a therapeutically effective amount 
of a compound identified as described herein. In one embodiment, the compound is an 
antibody. 

In another aspect, the present invention provides a drug screening assay comprising 
the steps of: (i) administering a test compound to a mammal having lung cancer or a cell 
isolated therefrom; (ii) comparing the level of gene expression of a polynucleotide that 
selectively hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 
1A-16 in a treated cell or mammal with the level of gene expression of the polynucleotide in 
a control cell or mammal, wherein a test compound that modulates the level of expression of 
the polynucleotide is a candidate for the treatment of lung cancer. 

In one embodiment, the control is a mammal with lung cancer or a cell therefrom that 
has not been treated with the test compound. In another embodiment, the control is a normal 
cell or mammal, or a cell or mammal with a non-malignant lung disease. 

In another aspect, the present invention provides a method for treating a mammal 
having lung cancer comprising administering a compound identified by the assay described 
herein. 

In another aspect, the present invention provides a pharmaceutical composition for 
treating a mammal having lung cancer, the composition comprising a compound identified by 
the assay described herein and a physiologically acceptable excipient. 

DETAILED DESCRIPTION OF THE INVENTION 

In accordance with the objects outlined above, the present invention provides novel 
methods for diagnosis and treatment of lung disease or cancer, as well as methods for 
screening for compositions which modulate lung cancer. "Treatment, monitoring, detection 
or modulation of lung disease or cancer" includes treatment, monitoring, detection, or 
modulation of lung disease in those patients who have lung disease (whether malignant or 
non-malignant, e.g., emphysema, bronchitis, or fibrosis) as well as patients with lung cancers 
in which gene expression from a gene in Tables 1A-16 is increased or decreased, indicating 
that the subject is more likely to have disease. In particular, while these targets are identified 
primarily from lung cancer samples, these same targets are likely to be similarly found in 
analyses of other medical conditions. These other conditions may result from similar 
pathological processes which affect similar tissues, e.g., lung cancer, small cell lung 
carcinoma (oat cell carcinoma), non-small cell carcinomas (e.g., squamous cell carcinoma, 
adenocarcinoma, large cell lung carcinoma, carcinoid, granulomatous), fibrosis (idiopathic 
pulmonary fibrosis (IPF), hypersensitivity pneumonitis (HP), interstitial pneumonitis, 
nonspecific idiopathic pneumonitis (NSIP)), chronic obstructive pulmonary disease (COPD, 
e.g., emphysema, chronic bronchitis), asthma, bronchiectasis, and esophageal cancer. See, 
e.g., the NCI webpage and USSN 60/347,349 and USSN 60/xxx,xxx (docket LFBR-001-1P, 
filed March 29, 2002), each of which is incorporated herein by reference. The treatment may 
be of the lung cancer or related condition itself, or treatment of metastasis. 

In particular, identification of markers selectively expressed on these cancers allows 
for use of that expression in diagnostic, prognostic, or therapeutic methods. As such, the 
invention defines various compositions, e.g., nucleic acids, polypeptides, antibodies, and 
small molecule agonists/antagonists, which will be useful to selectively identify those 
markers. For example, therapeutic methods may take the form of protein therapeutics which 
use the marker expression for selective localization or modulation of function (for those 
markers which have a causative disease effect), for vaccines, identification of binding 
partners, or antagonism, e.g., using antisense or RNAi. The markers may be useful for 
molecular characterization of subsets of lung diseases, which subsets may actually require 
very different treatments. Moreover, the markers may also be important in diseases related to 
the specific cancers, e.g., which affect similar tissues in non-malignant diseases, or have 
similar mechanisms of induction/maintenance. Metastatic processes or characteristics may 
also be targeted. Diagnostic and prognostic uses are made available, e.g., to subset related 
but distinct diseases, or to determine treatment strategy. The detection methods may be based 
upon nucleic acid, e.g., PCR or hybridization techniques, or protein, e.g., ELISA, imaging, 
IHC, etc. The diagnosis may be qualitative or quantitative, and may detect increases or 
decreases in expression levels. 

Tables 1A-16 provide unigene cluster identification numbers for the nucleotide 
sequence of genes that exhibit increased or decreased expression in lung cancer samples. The 
tables also provide an exemplar accession number that provides a nucleotide sequence that is 
part of the unigene cluster. In Table 1A, genes marked as "target 1" or "target 2" are 
particularly useful as therapeutic targets. Genes marked as "target 3" are particularly useful 
as diagnostic markers. Genes marked as "chron" are upregulated in chronically diseased lung 
(e.g., emphysema, bronchitis, fibrosis) relative to lung tumors and normal tissue. In certain 
analyses, the ratio for the "chron" category was determined using the 70th percentile of 
chronically diseased lung samples divided by the 90th percentile of normal lung samples. 
The ratio for the targets was determined using the 70th percentile of lung tumor samples 
divided by the 90th percentile of normal lung samples. 
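
By way of illustration only, the percentile ratio described above can be sketched in Python as follows. The function names, the linear-interpolation percentile rule, and the expression values are assumptions made for this example; the text does not specify how percentiles were interpolated or which software was used.

```python
def percentile(values, pct):
    """Return the pct-th percentile (0-100) of values.

    Assumption: linear interpolation between the two nearest ranks.
    The source does not state which percentile convention was used.
    """
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = pct * (len(ordered) - 1) / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def expression_ratio(disease_values, normal_values,
                     disease_pct=70, normal_pct=90):
    """70th percentile of diseased samples over 90th percentile of normals."""
    return percentile(disease_values, disease_pct) / percentile(normal_values, normal_pct)


# Hypothetical expression intensities for one gene (arbitrary units).
tumor = [120.0, 340.0, 560.0, 410.0, 980.0, 150.0,
         720.0, 630.0, 290.0, 880.0, 500.0]
normal = [40.0, 55.0, 60.0, 35.0, 80.0, 45.0,
          70.0, 50.0, 65.0, 30.0, 75.0]

ratio = expression_ratio(tumor, normal)
print(f"tumor/normal ratio: {ratio:.2f}")
```

Dividing a moderate percentile of the diseased samples by a high percentile of the normal samples is a conservative choice: a gene scores well only if most diseased samples exceed nearly all normal samples.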


Definitions 

The term "lung cancer protein" or "lung cancer polynucleotide" or "lung 
cancer-associated transcript" refers to nucleic acid and polypeptide polymorphic variants, alleles, 
mutants, and interspecies homologs that: (1) have a nucleotide sequence that has greater than 
about 60% nucleotide sequence identity, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 
92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% or greater nucleotide sequence identity, 
preferably over a region of at least about 25, 50, 100, 200, 500, 1000, or 
more nucleotides, to a nucleotide sequence of or associated with a unigene cluster of Tables 
1A-16; (2) bind to antibodies, e.g., polyclonal antibodies, raised against an immunogen 
comprising an amino acid sequence encoded by a nucleotide sequence of or associated with a 
unigene cluster of Tables 1A-16, and conservatively modified variants thereof; (3) 
specifically hybridize under stringent hybridization conditions to a nucleic acid sequence, or 
the complement thereof, of Tables 1A-16 and conservatively modified variants thereof; or (4) 
have an amino acid sequence that has greater than about 60% amino acid sequence identity, 
65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 
or 99% or greater amino acid sequence identity, preferably over a region of at 
least about 25, 50, 100, 200, 500, 1000, or more amino acids, to an amino acid sequence 
encoded by a nucleotide sequence of or associated with a unigene cluster of Tables 1A-16. A 
polynucleotide or polypeptide sequence is typically from a mammal including, but not 
limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep, or 
other mammal. A "lung cancer polypeptide" and a "lung cancer polynucleotide" include 
both naturally occurring and recombinant forms. 

A "full length" lung cancer protein or nucleic acid refers to a lung cancer polypeptide 
or polynucleotide sequence, or a variant thereof, that contains the elements normally 
contained in one or more naturally occurring, wild type lung cancer polynucleotide or 
polypeptide sequences. The "full length" may be prior to, or after, various stages of 
post-translational processing or splicing, including alternative splicing. 

"Biological sample" as used herein is a sample of biological tissue or fluid that 
contains nucleic acids or polypeptides, e.g., of a lung cancer protein, polynucleotide, or 
transcript. Such samples include, but are not limited to, tissue isolated from primates, e.g., 
humans, or rodents, e.g., mice, and rats. Biological samples may also include sections of 
tissues such as biopsy and autopsy samples, frozen sections taken for histologic purposes, 
archival materials, blood, plasma, serum, sputum, stool, tears, mucus, hair, skin, etc. 
Biological samples also include explants and primary and/or transformed cell cultures 
derived from patient tissues. A biological sample is typically obtained from a eukaryotic 
organism, most preferably a mammal such as a primate, e.g., chimpanzee or human; cow; 
dog; cat; a rodent, e.g., guinea pig, rat, mouse; rabbit; or other mammal; or a bird; reptile; 
or fish. Livestock and domestic animals are of interest. 

"Providing a biological sample" means to obtain a biological sample for use in 
methods described in this invention. Most often, this will be done by removing a sample of 
cells from an animal, but can also be accomplished by using previously isolated cells (e.g., 
isolated by another person, at another time, and/or for another purpose), or by performing the 
methods of the invention in vivo. Archival tissues or materials, having treatment or outcome 
history, will be particularly useful. 

The terms "identical" or percent "identity," in the context of two or more nucleic 
acids or polypeptide sequences, refer to two or more sequences or subsequences that are the 
same or have a specified percentage of amino acid residues or nucleotides that are the same 
(e.g., about 60% identity, preferably 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 
95%, 96%, 97%, 98%, 99%, or higher identity over a specified region, when compared and 
aligned for maximum correspondence over a comparison window or designated region) as 
measured using, e.g., the BLAST or BLAST 2.0 sequence comparison algorithms with default 
parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI 
web site http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to 
be "substantially identical." This definition also refers to, or may be applied to, the 
complement of a test sequence. The definition also includes sequences that have deletions 
and/or insertions, substitutions, and naturally occurring, e.g., polymorphic or allelic variants, 
and man-made variants. As described below, the preferred algorithms can account for gaps 
and the like. Preferably, identity exists over a region that is at least about 25 amino acids or 
nucleotides in length, or more preferably over a region that is 50-100 amino acids or 
nucleotides in length. 
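
As a non-limiting sketch, percent identity over a designated region of two already-aligned sequences can be computed as shown below in Python. The function name, the gap character, and the convention that gapped columns count toward the denominator are assumptions for this example; alignment programs such as BLAST report identity under their own conventions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns holding the same residue in both sequences.

    Assumptions: the inputs are already aligned (equal length, gaps shown
    as `gap`), and a column with a gap in one sequence counts as a
    non-identical position. Columns gapped in both sequences are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap and residue_b == gap:
            continue
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical 10-column alignment with one mismatch and one gap.
reference = "ACGTACGTAC"
test_seq = "ACGTTCG-AC"
identity = percent_identity(reference, test_seq)
print(f"{identity:.1f}% identity; meets 80% threshold: {identity >= 80.0}")
```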

For sequence comparison, typically one sequence acts as a reference sequence, to 
which test sequences are compared. When using a sequence comparison algorithm, test and 
reference sequences are entered into a computer, subsequence coordinates are designated, if 
necessary, and sequence algorithm program parameters are designated. Preferably, default 
program parameters can be used, or alternative parameters can be designated. The sequence 
comparison algorithm then calculates the percent sequence identities for the test sequences 
relative to the reference sequence, based on the program parameters. 

A "comparison window", as used herein, includes reference to a segment of 
contiguous positions selected from the group consisting typically of from 20 to 600, usually 
about 50 to about 200, more usually about 100 to about 150 in which a sequence may be 
compared to a reference sequence of the same number of contiguous positions after the two 
sequences are optimally aligned. Methods of alignment of sequences for comparison are 
well-known in the art. Optimal alignment of sequences for comparison can be conducted, 
e.g., by the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 
2:482, by the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 
48:443, by the search for similarity method of Pearson and Lipman (1988) Proc. Natl. Acad. 
Sci. USA 85:2444, by computerized implementations of these algorithms (GAP, BESTFIT, 
FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer 
Group, 575 Science Dr., Madison, WI), or by manual alignment and visual inspection (see, 
e.g., Ausubel, et al. (eds. 1995 and supplements) Current Protocols in Molecular Biology). 
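
For illustration, a minimal Python sketch of a local alignment score in the style of the Smith and Waterman local homology algorithm is given below. The scoring values (match, mismatch, and a linear gap penalty) are assumptions chosen for the example and are not parameters taken from this description or from the packaged implementations named above.

```python
def smith_waterman_score(seq_a, seq_b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between seq_a and seq_b.

    Dynamic programming with a linear gap penalty (an assumption; the
    packaged programs typically use affine gap penalties). Each cell holds
    the best score of a local alignment ending at that pair of positions,
    floored at zero so that an alignment may start anywhere.
    """
    cols = len(seq_b) + 1
    previous = [0] * cols
    best = 0
    for i in range(1, len(seq_a) + 1):
        current = [0] * cols
        for j in range(1, cols):
            pair_score = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            diagonal = previous[j - 1] + pair_score   # align the two residues
            up = previous[j] + gap                    # gap in seq_b
            left = current[j - 1] + gap               # gap in seq_a
            current[j] = max(0, diagonal, up, left)
            best = max(best, current[j])
        previous = current
    return best


# Hypothetical sequences sharing the local region "ACGT".
print(smith_waterman_score("GGACGTCC", "TTACGTAA"))
```

Only two rows of the scoring matrix are kept because the sketch returns a score rather than the alignment itself; recovering the aligned region would require storing the full matrix and tracing back from the best cell.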

Preferred examples of algorithms that are suitable for determining percent sequence 
identity and sequence similarity include the BLAST and BLAST 2.0 algorithms, which are 
described in Altschul, et al. (1997) Nuc. Acids Res. 25:3389-3402 and Altschul, et al. (1990) 
J. Mol. Biol. 215:403-410. BLAST and BLAST 2.0 are used, with the parameters described 
herein, to determine percent sequence identity for the nucleic acids and proteins of the 
invention. Software for performing BLAST analyses is publicly available through the 
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This 
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short 
words of length W in the query sequence, which either match or satisfy some positive-valued 
threshold score T when aligned with a word of the same length in a database sequence. T is 
referred to as the neighborhood word score threshold (Altschul, et al., supra). These initial 
neighborhood word hits act as seeds for initiating searches to find longer HSPs containing 
them. The word hits are extended in both directions along each sequence for as far as the 
cumulative alignment score can be increased. Cumulative scores are calculated using, e.g., 
for nucleotide sequences, the parameters M (reward score for a pair of matching residues; 
always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid 
sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word 
hits in each direction is halted when: the cumulative alignment score falls off by the 
quantity X from its maximum achieved value; the cumulative score goes to zero or below, 
due to the accumulation of one or more negative-scoring residue alignments; or the end of 
either sequence is reached. The BLAST algorithm parameters W, T, and X determine the 
sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) 
uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4, and a 
comparison of both strands. For amino acid sequences, the BLASTP program uses as 
defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix 
(see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915), alignments (B) of 
50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands. 
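
The seeding and extension steps described above can be sketched, for nucleotide sequences only, as follows in Python. This is a simplified illustration and not the BLAST implementation: it seeds on exact word matches of length W=11, extends without gaps using M=5 and N=-4, and stops on the three halting conditions stated above. The drop-off value X=20, the function names, and the example sequences are assumptions introduced for this example.

```python
def find_seeds(query, subject, word_length=11):
    """Return (query_pos, subject_pos) pairs where a word of length W matches exactly."""
    index = {}
    for j in range(len(subject) - word_length + 1):
        index.setdefault(subject[j:j + word_length], []).append(j)
    seeds = []
    for i in range(len(query) - word_length + 1):
        for j in index.get(query[i:i + word_length], []):
            seeds.append((i, j))
    return seeds


def _extend(a, b, reward, penalty, x_drop):
    """Ungapped extension along a and b; returns (best_score, best_length)."""
    score = best = best_len = 0
    for length, (x, y) in enumerate(zip(a, b), start=1):
        score += reward if x == y else penalty
        if score > best:
            best, best_len = score, length
        # Halt on a fall-off of X from the maximum, or a score of zero or below.
        if best - score >= x_drop or score <= 0:
            break
    return best, best_len  # the loop also ends when either sequence ends


def ungapped_hsp(query, subject, q_pos, s_pos, reward=5, penalty=-4, x_drop=20):
    """Extend a seed in both directions; returns (score, query_start, query_end)."""
    right_score, right_len = _extend(query[q_pos:], subject[s_pos:],
                                     reward, penalty, x_drop)
    left_score, left_len = _extend(query[:q_pos][::-1], subject[:s_pos][::-1],
                                   reward, penalty, x_drop)
    return left_score + right_score, q_pos - left_len, q_pos + right_len


# Hypothetical sequences sharing a 12-base core flanked by unrelated bases.
core = "ACGTTGCAAGCT"
query = "CCCC" + core + "CCCC"
subject = "GGGG" + core + "GGGG"

for q_pos, s_pos in find_seeds(query, subject):
    print((q_pos, s_pos), ungapped_hsp(query, subject, q_pos, s_pos))
```

Both seeds in this example extend to the same 12-base pair, which is why practical implementations discard seeds that fall inside an HSP already found.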

The BLAST algorithm also performs a statistical analysis of the similarity between 
two sequences (see, e.g., Karlin and Altschul (1993) Proc. Nat'l. Acad. Sci. USA 90:5873- 
5877). One measure of similarity provided by the BLAST algorithm is the smallest sum 
probability (P(N)), which provides an indication of the probability by which a match between 






WO 02/086443 PCT/US02/12476 
two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid 
is considered similar to a reference sequence if the smallest sum probability in a comparison 
of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably 
less than about 0.01, and most preferably less than about 0.001. Log values may be negative 
large numbers, e.g., 5, 10, 20, 30, 40, 40, 70, 90, 110, 150, 170, etc. 

An indication that two nucleic acid sequences are substantially identical is that the 
polypeptide encoded by the first nucleic acid is immunologically cross reactive with the 
antibodies raised against the polypeptide encoded by the second nucleic acid. Thus, a 
polypeptide is typically substantially identical to a second polypeptide, e.g., where the two 
peptides differ only by conservative substitutions. Another indication that two nucleic acid 
sequences are substantially identical is that the two molecules or their complements hybridize 
to each other under stringent conditions. Yet another indication that two nucleic acid 
sequences are substantially identical is that the same primers can be used to amplify the 
sequences. 

A "host cell" is a naturally occurring cell or a transformed cell that contains an 
expression vector and supports the replication or expression of the expression vector. Host 
cells may be cultured cells, explants, cells in vivo, and the like. Host cells may be 
prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or 
mammalian cells such as CHO, HeLa, and the like (see, e.g., the American Type Culture 
Collection catalog or web site, www.atcc.org). 

The terms "isolated," "purified," or "biologically pure" refer to material that is
substantially or essentially free from components that normally accompany it as found in its
native state. Purity and homogeneity are typically determined using analytical chemistry
techniques such as polyacrylamide gel electrophoresis or high performance liquid
chromatography. A protein or nucleic acid that is the predominant species present in a
preparation is substantially purified. In particular, an isolated nucleic acid is separated from
some open reading frames that naturally flank the gene and encode proteins other than the
protein encoded by the gene. The term "purified" in some embodiments denotes that a nucleic
acid or protein gives rise to essentially one band in an electrophoretic gel. Preferably, it means
that the nucleic acid or protein is at least about 85% pure, more preferably at least 95% pure,
and most preferably at least 99% pure. "Purify" or "purification" in other embodiments
means removing at least one contaminant or component from the composition to be purified.
In this sense, purification does not require that the purified compound be homogeneous, e.g.,
100% pure.

The terms "polypeptide," "peptide," and "protein" are used interchangeably herein to
refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which
one or more amino acid residues is an artificial chemical mimetic of a corresponding naturally
occurring amino acid, as well as to naturally occurring amino acid polymers, those containing
modified residues, and non-naturally occurring amino acid polymers.

The term "amino acid" refers to naturally occurring and synthetic amino acids, as well
as amino acid analogs and amino acid mimetics that function similarly to the naturally
occurring amino acids. Naturally occurring amino acids are those encoded by the genetic
code, as well as those amino acids that are later modified, e.g., hydroxyproline,
γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refer to compounds that have
the same basic chemical structure as a naturally occurring amino acid, e.g., an α carbon that is
bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine,
norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs may have
modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic
chemical structure as a naturally occurring amino acid. Amino acid mimetics refer to
chemical compounds that have a structure that is different from the general chemical
structure of an amino acid, but that function similarly to another amino acid.

Amino acids may be referred to herein by either their commonly known three letter
symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical
Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly
accepted single-letter codes.

"Conservatively modified variants" applies to both amino acid and nucleic acid
sequences. With respect to particular nucleic acid sequences, conservatively modified
variants refers to those nucleic acids which encode identical or essentially identical amino
acid sequences, or where the nucleic acid does not encode an amino acid sequence, to
essentially identical or associated, e.g., naturally contiguous, sequences. Because of the
degeneracy of the genetic code, a large number of functionally identical nucleic acids encode
most proteins. For instance, the codons GCA, GCC, GCG, and GCU each encode the amino
acid alanine. Thus, at each position where an alanine is specified by a codon, the codon can
be altered to another of the corresponding codons described without altering the encoded
polypeptide. Such nucleic acid variations are "silent variations," which are one species of
conservatively modified variations. Every nucleic acid sequence herein which encodes a
polypeptide also describes silent variations of the nucleic acid. In certain contexts each
codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and
TGG, which is ordinarily the only codon for tryptophan) can be modified to yield a
functionally similar molecule. Accordingly, a silent variation of a nucleic acid which
encodes a polypeptide is implicit in a described sequence with respect to the expression
product, but not necessarily with respect to actual probe sequences.
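
The notion of a silent variation can be made concrete with a short sketch. It assumes the standard genetic code, written in the DNA alphabet with U treated as T; the names `CODON_TABLE`, `translate`, `synonymous_codons`, and `is_silent_variation` are hypothetical and are not part of the disclosure:

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (an assumption of
# this sketch; other translation tables exist for some organelles/organisms).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def _normalize(seq: str) -> str:
    """Upper-case the sequence and write RNA (U) in the DNA alphabet (T)."""
    return seq.upper().replace("U", "T")


def translate(seq: str) -> str:
    """Translate complete codons; raises KeyError on non-ACGT/U characters."""
    dna = _normalize(seq)
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(amino_acid: str) -> list:
    """All codons that encode the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def is_silent_variation(seq_a: str, seq_b: str) -> bool:
    """True if the sequences differ but encode the same polypeptide."""
    return (_normalize(seq_a) != _normalize(seq_b)
            and translate(seq_a) == translate(seq_b))


if __name__ == "__main__":
    print(synonymous_codons("A"))   # the four alanine codons
    print(synonymous_codons("M"))   # methionine has a single codon
    print(is_silent_variation("ATGGCATGG", "ATGGCCTGG"))
```

Methionine and tryptophan each have a single codon in this table, which is why the text singles them out as exceptions.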

As to amino acid sequences, one of skill will recognize that individual substitutions,
deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which
alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded
sequence is a "conservatively modified variant" where the alteration results in the substitution
of an amino acid with a chemically similar amino acid. Conservative substitution tables
providing functionally similar amino acids are well known in the art. Such conservatively
modified variants are in addition to and do not exclude polymorphic variants, interspecies
homologs, and alleles of the invention. Typically, conservative substitutions for one
another include: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid (E); 3) Asparagine
(N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine
(M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W); 7) Serine (S),
Threonine (T); and 8) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins (1984)).
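
The eight substitution groups listed above can be encoded directly. This is a minimal sketch: the names `CONSERVATIVE_GROUPS`, `is_conservative_substitution`, and `differ_only_conservatively` are hypothetical, and the equal-length, position-by-position comparison is a simplifying assumption that ignores insertions and deletions:

```python
# The eight groups from the text, as one-letter amino acid codes.
# Methionine (M) appears in two groups (5 and 8), as in the text.
CONSERVATIVE_GROUPS = [
    set("AG"),    # 1) Alanine, Glycine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
    set("ST"),    # 7) Serine, Threonine
    set("CM"),    # 8) Cysteine, Methionine
]


def is_conservative_substitution(a: str, b: str) -> bool:
    """True if residues a and b differ and share at least one group."""
    a, b = a.upper(), b.upper()
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def differ_only_conservatively(pep1: str, pep2: str) -> bool:
    """True if two equal-length peptides differ only by conservative substitutions."""
    if len(pep1) != len(pep2):
        return False
    return all(
        x.upper() == y.upper() or is_conservative_substitution(x, y)
        for x, y in zip(pep1, pep2)
    )


if __name__ == "__main__":
    print(is_conservative_substitution("D", "E"))
    print(differ_only_conservatively("MKTAY", "MRSGF"))
```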

Macromolecular structures such as polypeptide structures can be described in terms of
various levels of organization. For a general discussion of this organization, see, e.g.,
Alberts, et al. (1994) Molecular Biology of the Cell (3rd ed.) and Cantor and Schimmel (1980)
Biophysical Chemistry Part I: The Conformation of Biological Macromolecules. "Primary
structure" refers to the amino acid sequence of a particular peptide. "Secondary structure"
refers to locally ordered, three dimensional structures within a polypeptide. These structures
are commonly known as domains. Domains are portions of a polypeptide that often form a
compact unit of the polypeptide and are typically 25 to approximately 500 amino acids long.
Typical domains are made up of sections of lesser organization such as stretches of β-sheet
and α-helices. "Tertiary structure" refers to the complete three dimensional structure of a
polypeptide monomer. "Quaternary structure" refers to the three dimensional structure
formed, usually by the noncovalent association of independent tertiary units. Anisotropic
terms are also known as energy terms.

"Nucleic acid" or "oligonucleotide" or "polynucleotide" or grammatical equivalents
used herein means at least two nucleotides covalently linked together. Oligonucleotides are
typically from about 5, 6, 7, 8, 9, 10, 12, 15, 25, 30, 40, 50 or more nucleotides in length, up
to about 100 nucleotides in length. Nucleic acids and polynucleotides are polymers of any
length, including longer lengths, e.g., 200, 300, 500, 1000, 2000, 3000, 5000, 7000, 10,000,
etc. A nucleic acid of the present invention will generally contain phosphodiester bonds,
although in some cases, nucleic acid analogs are included that may have at least one different
linkage, e.g., phosphoramidate, phosphorothioate, phosphorodithioate, or
O-methylphosphoroamidite linkages (see Eckstein (1992) Oligonucleotides and Analogues: A
Practical Approach Oxford University Press); and peptide nucleic acid backbones and
linkages. Other analog nucleic acids include those with positive backbones; non-ionic
backbones, and non-ribose backbones, including those described in U.S. Patent Nos.
5,235,033 and 5,034,506, and Chapters 6 and 7, in Sanghvi and Cook, eds. Carbohydrate
Modifications in Antisense Research. ACS Symposium Series 580. Nucleic acids containing
one or more carbocyclic sugars are also included within one definition of nucleic acids.
Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to
increase the stability and half-life of such molecules in physiological environments or as
probes on a biochip. Mixtures of naturally occurring nucleic acids and analogs can be made;
alternatively, mixtures of different nucleic acid analogs, and mixtures of naturally occurring
nucleic acids and analogs may be made.

Particularly preferred are peptide nucleic acids (PNA), which include peptide nucleic
acid analogs. These backbones are substantially non-ionic under neutral conditions, in
contrast to the highly charged phosphodiester backbone of naturally occurring nucleic acids.
This results in two advantages. First, the PNA backbone exhibits improved hybridization
kinetics. PNAs have larger changes in the melting temperature (Tm) for mismatched versus
perfectly matched basepairs. DNA and RNA typically exhibit a 2-4° C drop in Tm for an
internal mismatch. With the non-ionic PNA backbone, the drop is closer to 7-9° C.
Similarly, due to their non-ionic nature, hybridization of the bases attached to these
backbones is relatively insensitive to salt concentration. In addition, PNAs are not degraded
by cellular enzymes, and thus can be more stable.

The nucleic acids may be single stranded or double stranded, as specified, or contain
portions of both double stranded and single stranded sequence. As will be appreciated by those
in the art, the depiction of a single strand also defines the sequence of the complementary
strand; thus the sequences described herein also provide the complement of the sequence.
The nucleic acid may be DNA, both genomic and cDNA, RNA, or a hybrid, where the
nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations
of bases, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine,
hypoxanthine, isocytosine, isoguanine, etc. "Transcript" typically refers to a naturally
occurring RNA, e.g., a pre-mRNA, hnRNA, or mRNA. As used herein, the term
"nucleoside" includes nucleotides and nucleoside and nucleotide analogs, and modified
nucleosides such as amino modified nucleosides. In addition, "nucleoside" includes
non-naturally occurring analog structures. Thus, e.g., the individual units of a peptide nucleic
acid, each containing a base, are referred to herein as a nucleoside.
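
Because a depicted strand also defines its complement, the complementary strand can be derived mechanically. The sketch below is illustrative only: the function name `reverse_complement` is hypothetical, the result is written 5' to 3' (hence the reversal), and only the four standard bases of DNA or RNA are handled, so modified bases such as inosine pass through unchanged:

```python
# Translation tables for Watson-Crick pairing; case is preserved.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
_RNA_COMPLEMENT = str.maketrans("ACGUacgu", "UGCAugca")


def reverse_complement(seq: str, rna: bool = False) -> str:
    """Return the complementary strand, written 5'->3'.

    Characters outside the standard four bases are left unchanged;
    that is a simplification made for this sketch.
    """
    table = _RNA_COMPLEMENT if rna else _DNA_COMPLEMENT
    return seq.translate(table)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(reverse_complement("AUGC", rna=True))
```

Applying the function twice returns the original sequence, which is a convenient sanity check.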

A "label" or a "detectable moiety" is a composition detectable by spectroscopic,
photochemical, biochemical, immunochemical, physiological, chemical, or other physical
means. For example, useful labels include 32P, fluorescent dyes, electron-dense reagents,
enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins
or other entities which can be made detectable, e.g., by incorporating a radiolabel into the
peptide or used to detect antibodies specifically reactive with the peptide. The labels may be
incorporated into the cancer nucleic acids, proteins, and antibodies. Many methods known in
the art for conjugating the antibody to the label may be employed, including those methods
described by Hunter, et al. (1962) Nature 144:945; David, et al. (1974) Biochemistry
13:1014-1021; Pain, et al. (1981) J. Immunol. Meth. 40:219-230; and Nygren (1982) J.
Histochem. and Cytochem. 30:407-412.

An "effector" or "effector moiety" or "effector component" is a molecule that is
bound (or linked, or conjugated), either covalently, through a linker or a chemical bond, or
noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds, to an antibody.
The "effector" can be a variety of molecules including, e.g., detection moieties including
radioactive compounds, fluorescent compounds, an enzyme or substrate, tags such as epitope
tags, a toxin; activatable moieties, a chemotherapeutic agent; a lipase; an antibiotic; or a
radioisotope emitting "hard," e.g., beta, radiation.

A "labeled nucleic acid probe or oligonucleotide" is one that is bound, either
covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der
Waals, electrostatic, or hydrogen bonds to a label such that the presence of the probe may be
detected by detecting the presence of the label bound to the probe. Alternatively, methods
using high affinity interactions may achieve the same results where one of a pair of binding
partners binds to the other, e.g., biotin, streptavidin.

As used herein a "nucleic acid probe or oligonucleotide" is a nucleic acid capable of
binding to a target nucleic acid of complementary sequence through one or more types of
chemical bonds, usually through complementary base pairing, e.g., through hydrogen bond
formation. As used herein, a probe may include natural (i.e., A, G, C, or T) or modified bases
(7-deazaguanosine, inosine, etc.). In addition, the bases in a probe may be joined by a
linkage other than a phosphodiester bond, preferably one that does not functionally interfere
with hybridization. Thus, e.g., probes may be peptide nucleic acids in which the constituent
bases are joined by peptide bonds rather than phosphodiester linkages. Probes may bind
target sequences lacking complete complementarity with the probe sequence depending upon
the stringency of the hybridization conditions. The probes are preferably directly labeled,
e.g., with isotopes, chromophores, lumiphores, chromogens, or indirectly labeled, e.g., with
biotin to which a streptavidin complex may later bind. By assaying for the presence or
absence of the probe, one can detect the presence or absence of the select sequence or
subsequence. Diagnosis or prognosis may be based at the genomic level, or at the level of
RNA or protein expression.

The term "recombinant" when used with reference, e.g., to a cell, or nucleic acid,
protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been modified by
the introduction of a heterologous nucleic acid or protein or the alteration of a native nucleic
acid or protein, or that the cell is derived from a cell so modified. Thus, e.g., recombinant
cells express genes that are not found within the native (non-recombinant) form of the cell or
express native genes that are otherwise abnormally expressed, under expressed or not
expressed at all. By the term "recombinant nucleic acid" herein is meant nucleic acid,
originally formed in vitro, in general, by the manipulation of nucleic acid, e.g., using
polymerases and endonucleases, in a form not normally found in nature. In this manner,
operable linkage of different sequences is achieved. Thus an isolated nucleic acid, in a linear
form, or an expression vector formed in vitro by ligating DNA molecules that are not
normally joined, are both considered recombinant for the purposes of this invention. It is
understood that once a recombinant nucleic acid is made and reintroduced into a host cell or
organism, it will replicate non-recombinantly, i.e., using the in vivo cellular machinery of the
host cell rather than in vitro manipulations; however, such nucleic acids, once produced
recombinantly, although subsequently replicated non-recombinantly, are still considered
recombinant for the purposes of the invention. Similarly, a "recombinant protein" is a protein
made using recombinant techniques, i.e., through the expression of a recombinant nucleic
acid as depicted above.

The term "heterologous" when used with reference to portions of a nucleic acid
indicates that the nucleic acid comprises two or more subsequences that are not normally
found in the same relationship to each other in nature. For instance, the nucleic acid is
typically recombinantly produced, having two or more sequences, e.g., from unrelated genes
arranged to make a new functional nucleic acid, e.g., a promoter from one source and a
coding region from another source. Similarly, a heterologous protein will often refer to two
or more subsequences that are not found in the same relationship to each other in nature (e.g.,
a fusion protein).

A "promoter" is typically an array of nucleic acid control sequences that direct
transcription of a nucleic acid. As used herein, a promoter includes necessary nucleic acid
sequences near the start site of transcription, such as, in the case of a polymerase II type
promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor
elements, which can be located as much as several thousand base pairs from the start site of
transcription. A "constitutive" promoter is a promoter that is active under most
environmental and developmental conditions. An "inducible" promoter is a promoter that is
active under environmental or developmental regulation. The term "operably linked" refers
to a functional linkage between a nucleic acid expression control sequence (such as a
promoter, or array of transcription factor binding sites) and a second nucleic acid sequence,
e.g., wherein the expression control sequence directs transcription of the nucleic acid
corresponding to the second sequence.

An "expression vector" is a nucleic acid construct, generated recombinantly or
synthetically, with a series of specified nucleic acid elements that permit transcription of a
particular nucleic acid in a host cell. The expression vector can be part of a plasmid, virus, or
nucleic acid fragment. Typically, the expression vector includes a nucleic acid to be
transcribed in operable linkage to a promoter.

The phrase "selectively (or specifically) hybridizes to" refers to the binding,
duplexing, or hybridizing of a molecule selectively to a particular nucleotide sequence under
stringent hybridization conditions when that sequence is present in a complex mixture (e.g.,
total cellular or library DNA or RNA).

The phrase "stringent hybridization conditions" refers to conditions under which a
probe will hybridize to its target subsequence, typically in a complex mixture of nucleic
acids, but to essentially no other sequences. Stringent conditions are sequence-dependent and
will be different in different circumstances. Longer sequences hybridize specifically at
higher temperatures. An extensive guide to the hybridization of nucleic acids is found in
"Overview of principles of hybridization and the strategy of nucleic acid assays" in Tijssen
(1993) Techniques in Biochemistry and Molecular Biology--Hybridization with Nucleic
Probes (vol. 24) Elsevier. Generally, stringent conditions are selected to be about 5-10° C
lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength
and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic acid
concentration) at which 50% of the probes complementary to the target hybridize to the target
sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the
probes are occupied at equilibrium). Stringent conditions will be those in which the salt
concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion
concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C for
short probes (e.g., 10 to 50 nucleotides) and at least about 60° C for long probes (e.g., greater
than 50 nucleotides). Stringent conditions may also be achieved with the addition of
destabilizing agents such as formamide. For selective or specific hybridization, a positive
signal is typically at least two times background, preferably 10 times background hybridization.
Exemplary stringent hybridization conditions are often: 50% formamide, 5x SSC, and 1%
SDS, incubating at 42° C, or, 5x SSC, 1% SDS, incubating at 65° C, with wash in 0.2x SSC,
and 0.1% SDS at 65° C. For PCR, a temperature of about 36° C is typical for low stringency
amplification, although annealing temperatures may vary between about 32° C and 48° C
depending on primer length. For high stringency PCR amplification, a temperature of about
62° C is typical, although high stringency annealing temperatures can range from about 50° C
to about 65° C, depending on the primer length and specificity. Typical cycle conditions for
both high and low stringency amplifications include a denaturation phase of 90° C - 95° C for
0.5 - 2 min., an annealing phase lasting 0.5 - 2 min., and an extension phase of about 72° C
for 1-2 min. Protocols and guidelines for low and high stringency amplification reactions
are provided, e.g., in Innis, et al. (1990) PCR Protocols. A Guide to Methods and
Applications.
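
The numerical guidelines above for stringent hybridization can be collected in a small sketch. All function names are hypothetical, the Tm is assumed to have been measured or estimated elsewhere, and the cutoffs are treated as exact even though the text qualifies them with "about":

```python
def stringent_temperature_window(tm_celsius: float) -> tuple:
    """Temperatures about 5-10 degrees C below Tm, returned as (low, high)."""
    return (tm_celsius - 10.0, tm_celsius - 5.0)


def minimum_temperature(probe_length: int) -> float:
    """Lower temperature bound from the text, in degrees C.

    Probes longer than 50 nucleotides are "long" (at least about 60);
    everything else is treated as "short" (at least about 30). Lumping
    probes under 10 nucleotides with the short class is an assumption.
    """
    return 60.0 if probe_length > 50 else 30.0


def signal_call(signal: float, background: float) -> str:
    """Classify a hybridization signal relative to background."""
    if background <= 0:
        raise ValueError("background must be positive")
    ratio = signal / background
    if ratio >= 10:
        return "preferred positive"   # at least 10 times background
    if ratio >= 2:
        return "positive"             # at least two times background
    return "negative"


if __name__ == "__main__":
    print(stringent_temperature_window(72.0))
    print(minimum_temperature(25), minimum_temperature(200))
    print(signal_call(250, 100))
```

The sketch does not model salt concentration, pH, or formamide, all of which shift the effective stringency in practice.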

Nucleic acids that do not hybridize to each other under stringent conditions are still
substantially identical if the polypeptides which they encode are substantially identical. This
occurs, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy
permitted by the genetic code. In such cases, the nucleic acids typically hybridize under
moderately stringent hybridization conditions. Exemplary "moderately stringent
hybridization conditions" include a hybridization in a buffer of 40% formamide, 1 M NaCl,
1% SDS at 37° C, and a wash in 1x SSC at 45° C. A positive hybridization is at least twice
background. Alternative hybridization and wash conditions can be utilized to provide
conditions of similar stringency. Additional guidelines for determining hybridization
parameters are provided in numerous references, e.g., Ausubel, et al. (ed.) Current Protocols in
Molecular Biology Lippincott.

The phrase "functional effects" in the context of assays for testing compounds that
modulate activity of a lung cancer protein includes the determination of a parameter that is
indirectly or directly under the influence of the lung cancer protein or nucleic acid, e.g., a
physiological, enzymatic, functional, physical, or chemical effect, such as the ability to
decrease lung cancer. It includes ligand binding activity; cell viability, cell growth on soft
agar; anchorage dependence; contact inhibition and density limitation of growth; cellular
proliferation; cellular transformation; growth factor or serum dependence; tumor specific
marker levels; invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and
protein expression in cells undergoing metastasis, and other characteristics of lung cancer
cells. "Functional effects" include in vitro, in vivo, and ex vivo activities.

By "determining the functional effect" is meant assaying for a compound that
increases or decreases a parameter that is indirectly or directly under the influence of a lung
cancer protein sequence, e.g., physiological, functional, enzymatic, physical, or chemical
effects. Such functional effects can be measured by many means known to those skilled in
the art, e.g., changes in spectroscopic characteristics (e.g., fluorescence, absorbance,
refractive index), hydrodynamic (e.g., shape), chromatographic, or solubility properties for
the protein, measuring inducible markers or transcriptional activation of the lung cancer
protein; measuring binding activity or binding assays, e.g., binding to antibodies or other
ligands, and measuring cellular proliferation. Determination of the functional effect of a
compound on lung cancer can also be performed using lung cancer assays known to those of
skill in the art such as in vitro assays, e.g., cell growth on soft agar; anchorage
dependence; contact inhibition and density limitation of growth; cellular proliferation;
cellular transformation; growth factor or serum dependence; tumor specific marker levels;
invasiveness into Matrigel; tumor growth and metastasis in vivo; mRNA and protein
expression in cells undergoing metastasis, and other characteristics of lung cancer cells. The
functional effects can be evaluated by many means known to those skilled in the art, e.g.,
microscopy for quantitative or qualitative measures of alterations in morphological features,
measurement of changes in RNA or protein levels for lung cancer-associated sequences,
measurement of RNA stability, identification of downstream or reporter gene expression
(CAT, luciferase, β-gal, GFP, and the like), e.g., via chemiluminescence, fluorescence,
colorimetric reactions, antibody binding, inducible markers, and ligand binding assays.

"Inhibitors," "activators," and "modulators" of lung cancer polynucleotide and
polypeptide sequences are used to refer to activating, inhibitory, or modulating molecules or
compounds identified using in vitro and in vivo assays of lung cancer polynucleotide and
polypeptide sequences. Inhibitors are compounds that, e.g., bind to, partially or totally block
activity, decrease, prevent, delay activation, inactivate, desensitize, or down regulate the
activity or expression of lung cancer proteins, e.g., antagonists. Antisense or inhibitory
nucleic acids may seem to inhibit expression and subsequent function of the protein.
"Activators" are compounds that increase, open, activate, facilitate, enhance activation,
sensitize, agonize, or up regulate lung cancer protein activity. Inhibitors, activators, or
modulators also include genetically modified versions of lung cancer proteins, e.g., versions
with altered activity, as well as naturally occurring and synthetic ligands, antagonists,
agonists, antibodies, small chemical molecules and the like. Such assays for inhibitors and
activators include, e.g., expressing the lung cancer protein in vitro, in cells, or cell
membranes, applying putative modulator compounds, and then determining the functional
effects on activity, as described above. Activators and inhibitors of lung cancer can also be
identified by incubating lung cancer cells with the test compound and determining increases
or decreases in the expression of 1 or more lung cancer proteins, e.g., 1, 2, 3, 4, 5, 10, 15, 20,
25, 30, 40, 50 or more lung cancer proteins, such as lung cancer proteins encoded by the
sequences set out in Tables 1A-16.

Samples or assays comprising lung cancer proteins that are treated with a potential
activator, inhibitor, or modulator are compared to control samples without the inhibitor,
activator, or modulator to examine the extent of inhibition. Control samples (untreated with
inhibitors) are assigned a relative protein activity value of 100%. Inhibition of a polypeptide
is achieved when the activity value relative to the control is about 80%, preferably 50%, more
preferably 25-0%. Activation of a lung cancer polypeptide is achieved when the activity
value relative to the control (untreated with activators) is 110%, more preferably 150%, more
preferably 200-500% (i.e., two to five fold higher relative to the control), more preferably
1000-3000% higher.
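
The comparison against an untreated control can be expressed as a short calculation. This is a sketch under stated assumptions: the function name `classify_activity` and the "no call" label for values between the two cutoffs are hypothetical, and the 80% and 110% cutoffs are treated as exact even though the text says "about":

```python
def classify_activity(treated: float, control: float) -> tuple:
    """Return (percent of control, label) for a treated sample.

    The control is assigned 100%. Values at or below 80% are called
    inhibition and values at or above 110% are called activation; the
    text does not name the range in between, so "no call" is used here.
    """
    if control <= 0:
        raise ValueError("control activity must be positive")
    percent = 100.0 * treated / control
    if percent <= 80.0:
        label = "inhibition"
    elif percent >= 110.0:
        label = "activation"
    else:
        label = "no call"
    return percent, label


if __name__ == "__main__":
    for treated in (40.0, 95.0, 150.0):
        print(treated, classify_activity(treated, control=100.0))
```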

The phrase "changes in cell growth" refers to any change in cell growth and
proliferation characteristics in vitro or in vivo, such as cell viability, formation of foci,
anchorage independence, semi-solid or soft agar growth, changes in contact inhibition and
density limitation of growth, loss of growth factor or serum requirements, changes in cell
morphology, gaining or losing immortalization, gaining or losing tumor specific markers,
ability to form or suppress tumors when injected into suitable animal hosts, and/or
immortalization of the cell. See, e.g., Freshney (1994) Culture of Animal Cells: A Manual of
Basic Technique pp. 231-241 (3rd ed.).

"Tumor cell" refers to precancerous, cancerous, and normal cells in a tumor.
"Cancer cells," "transformed" cells, or "transformation" in tissue culture, refers to
spontaneous or induced phenotypic changes that do not necessarily involve the uptake of new
genetic material. Although transformation can arise from infection with a transforming virus
and incorporation of new genomic DNA, or uptake of exogenous DNA, it can also arise
spontaneously or following exposure to a carcinogen, thereby mutating an endogenous gene.
Transformation is associated with phenotypic changes, such as immortalization of cells,
aberrant growth control, nonmorphological changes, and/or malignancy (see, Freshney
(1994) Culture of Animal Cells: A Manual of Basic Technique (3rd ed.)).

"Antibody" refers to a polypeptide comprising a framework region from an
immunoglobulin gene or fragments thereof that specifically binds and recognizes an antigen.
The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta,
epsilon, and mu constant region genes, as well as the myriad immunoglobulin variable region
genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as
gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG,
IgM, IgA, IgD, and IgE, respectively. Typically, the antigen-binding region of an antibody or
its functional equivalent will be most critical in specificity and affinity of binding. See Paul,
Fundamental Immunology.
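
The correspondence between heavy chain types and immunoglobulin classes stated above is a simple lookup. The names `HEAVY_CHAIN_TO_CLASS`, `LIGHT_CHAIN_TYPES`, and `immunoglobulin_class` in this sketch are hypothetical:

```python
# Heavy chain type -> immunoglobulin class, in the order given in the text.
HEAVY_CHAIN_TO_CLASS = {
    "gamma": "IgG",
    "mu": "IgM",
    "alpha": "IgA",
    "delta": "IgD",
    "epsilon": "IgE",
}

# Light chains are classified as either kappa or lambda.
LIGHT_CHAIN_TYPES = frozenset({"kappa", "lambda"})


def immunoglobulin_class(heavy_chain: str) -> str:
    """Return the immunoglobulin class defined by a heavy chain type."""
    key = heavy_chain.strip().lower()
    if key not in HEAVY_CHAIN_TO_CLASS:
        raise ValueError(f"unknown heavy chain type: {heavy_chain!r}")
    return HEAVY_CHAIN_TO_CLASS[key]


if __name__ == "__main__":
    print(immunoglobulin_class("gamma"))
    print("kappa" in LIGHT_CHAIN_TYPES)
```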

An exemplary immunoglobulin (antibody) structural unit comprises a tetramer. Each
tetramer is composed of two identical pairs of polypeptide chains, each pair having one
"light" (about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus of each
chain defines a variable region of about 100 to 110 or more amino acids primarily responsible
for antigen recognition. The terms variable light chain (VL) and variable heavy chain (VH)
refer to these light and heavy chains respectively.

Antibodies exist, e.g., as intact immunoglobulins or as a number of well-characterized
fragments produced by digestion with various peptidases. Thus, e.g., pepsin digests an
antibody below the disulfide linkages in the hinge region to produce F(ab)'2, a dimer of Fab
which itself is a light chain joined to VH-CH1 by a disulfide bond. The F(ab)'2 may be
reduced under mild conditions to break the disulfide linkage in the hinge region, thereby
converting the F(ab)'2 dimer into an Fab' monomer. The Fab' monomer is essentially Fab
with part of the hinge region (see Paul (ed. 1999) Fundamental Immunology (4th ed.)). While
various antibody fragments are defined in terms of the digestion of an intact antibody, one of
skill will appreciate that such fragments may be synthesized de novo either chemically or by
using recombinant DNA methodology. Thus, the term antibody, as used herein, also includes
antibody fragments either produced by the modification of whole antibodies, or those
synthesized de novo using recombinant DNA methodologies (e.g., single chain Fv) or those
identified using phage display libraries (see, e.g., McCafferty, et al. (1990) Nature
348:552-554).

For preparation of antibodies, e.g., recombinant, monoclonal, or polyclonal
antibodies, many techniques known in the art can be used (see, e.g., Kohler and Milstein
(1975) Nature 256:495-497; Kozbor, et al. (1983) Immunology Today 4:72; Cole, et al.
(1985), pp. 77-96 in Monoclonal Antibodies and Cancer Therapy; Coligan (1991 and
supplements) Current Protocols in Immunology; Harlow and Lane (1988) Antibodies, A
Laboratory Manual; and Goding (1986) Monoclonal Antibodies: Principles and Practice (2d
ed.)). Techniques for the production of single chain antibodies (U.S. Patent 4,946,778) can
be adapted to produce antibodies to polypeptides of this invention. Also, transgenic mice, or
other organisms such as other mammals, may be used to express humanized antibodies.
Alternatively, phage display technology can be used to identify antibodies and heteromeric
Fab fragments that specifically bind to selected antigens (see, e.g., McCafferty, et al. (1990)
Nature 348:552-554; Marks, et al. (1992) Biotechnology 10:779-783).

A "chimeric antibody" is an antibody molecule in which, e.g., (a) the constant region,
or a portion thereof, is altered, replaced, or exchanged so that the antigen binding site
(variable region) is linked to a constant region of a different or altered class, effector
function, and/or species, or an entirely different molecule which confers new properties to the
chimeric antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the
variable region, or a portion thereof, is altered, replaced, or exchanged with a variable region
having a different or altered antigen specificity.



Identification of lung cancer-associated sequences 
In one aspect, the expression levels of genes are determined in different patient
samples for which diagnosis information is desired, to provide expression profiles. An
expression profile of a particular sample is essentially a "fingerprint" of the state of the
sample; while two states may have any particular gene similarly expressed, the evaluation of
a number of genes simultaneously allows the generation of a gene expression profile that is
characteristic of the state of the cell. That is, normal tissue may be distinguished from
cancerous or metastatic cancerous tissue, or metastatic cancerous tissue can be compared
with tissue from surviving cancer patients. By comparing expression profiles of tissue in
known different lung cancer states, information regarding which genes are important
(including both up- and down-regulation of genes) in each of these states is obtained.
Molecular profiling may distinguish subtypes of a currently collective disease designation,
e.g., different forms of lung cancer (chronic disease, adenocarcinoma, etc.).

The identification of sequences that are differentially expressed in lung cancer versus
non-lung cancer tissue allows the use of this information in a number of ways. For example,
a particular treatment regime may be evaluated: does a chemotherapeutic drug act to
down-regulate lung cancer, and thus tumor growth or recurrence, in a particular patient?
Alternatively, a treatment step may induce other markers which may be used as targets to
destroy tumor cells. Similarly, diagnosis and treatment outcomes may be done or confirmed
by comparing patient samples with the known expression profiles. Malignant disease may be
compared to non-malignant conditions. Metastatic tissue can also be analyzed to determine
the stage of lung cancer in the tissue, or origin of primary tumor, e.g., metastasis from a
remote primary site. Furthermore, these gene expression profiles (or individual genes) allow
screening of drug candidates with an eye to mimicking or altering a particular expression
profile; e.g., screening can be done for drugs that suppress the lung cancer expression profile.
This may be done by making biochips comprising sets of the important lung cancer genes,
which can then be used in these screens. PCR methods may be applied with selected primer
pairs, and analysis may be of RNA or of genomic sequences. These methods can also be
done on the protein basis; that is, protein expression levels of the lung cancer proteins can be
evaluated for diagnostic purposes or to screen candidate agents. In addition, the lung cancer
nucleic acid sequences can be administered for gene therapy purposes, including the
administration of antisense nucleic acids, or the lung cancer proteins (including antibodies
and other modulators thereof) administered as therapeutic drugs or as protein or DNA
vaccines.

Thus the present invention provides nucleic acid and protein sequences that are
differentially expressed in lung cancer relative to normal tissues and/or non-malignant lung
disease, or in different types of lung disease, herein termed "lung cancer sequences." As
outlined below, lung cancer sequences include those that are up-regulated (i.e., expressed at a
higher level) in lung cancer, as well as those that are down-regulated (i.e., expressed at a
lower level). In a preferred embodiment, the lung cancer sequences are from humans;
however, as will be appreciated by those in the art, lung cancer sequences from other
organisms may be useful in animal models of disease and drug evaluation; thus, other lung
cancer sequences are provided, from vertebrates, including mammals, including rodents (rats,
mice, hamsters, guinea pigs, etc.), primates, farm animals (including sheep, goats, pigs, cows,
horses, etc.) and pets (dogs, cats, etc.). Lung cancer sequences from other organisms may be
obtained using the techniques outlined below.

Lung cancer sequences can include both nucleic acid and amino acid sequences. As
will be appreciated by those in the art and is more fully outlined below, lung cancer nucleic
acid sequences are useful in a variety of applications, including diagnostic applications,
which will detect naturally occurring nucleic acids, as well as screening applications; e.g.,
biochips comprising nucleic acid probes or PCR microtiter plates with selected probes to the
lung cancer sequences can be generated.

A lung cancer sequence can be initially identified by substantial nucleic acid and/or
amino acid sequence homology to the lung cancer sequences outlined herein. Such
homology can be based upon the overall nucleic acid or amino acid sequence, and is
generally determined as outlined below, e.g., using homology programs or hybridization
conditions.

For identifying lung cancer-associated sequences, the lung cancer screen typically
includes comparing genes identified in different tissues, e.g., normal and cancerous tissues,
cancer and non-malignant conditions, non-malignant conditions and normal tissues, or tumor
tissue samples from patients who have metastatic disease vs. non-metastatic tissue. Other
suitable tissue comparisons include comparing lung cancer samples with metastatic cancer
samples from other cancers, such as breast, other gastrointestinal cancers, prostate, ovarian,
etc. Samples of non-metastatic disease tissue and tissue undergoing metastasis are applied to
biochips comprising nucleic acid probes. The samples are first microdissected, if applicable,
and treated as is known in the art for the preparation of mRNA. Suitable biochips are
commercially available, e.g., from Affymetrix, Santa Clara, CA. Gene expression profiles as
described herein are generated and the data analyzed.

In one embodiment, the genes showing changes in expression as between normal and
disease states are compared to genes expressed in other normal tissues, preferably normal
lung, but also including, and not limited to, colon, heart, brain, liver, breast, kidney, muscle,
prostate, small intestine, large intestine, spleen, bone, and/or placenta. In a preferred
embodiment, those genes identified during the lung cancer screen that are expressed in
significant amounts in other tissues (e.g., essential organs) are removed from the profile,
although in some embodiments, this is not necessary (e.g., where organs may be dispensable
at a later stage of life). That is, when screening for drugs, it is usually preferable that the
target expression be disease specific, to minimize possible side effects on other organs.
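
The removal of genes expressed in other normal tissues can be sketched as follows. This is
an illustrative example only: the function name, the data layout, and the cutoff value are
assumptions made for the sketch and are not taken from the specification, which does not
define what level counts as a "significant amount."

```python
def filter_disease_specific(candidates, normal_expression, max_normal_level=100.0):
    """Keep candidate genes whose expression stays at or below a cutoff
    in every normal tissue that was profiled.

    candidates        -- list of gene identifiers from the lung cancer screen
    normal_expression -- dict: gene -> {tissue name -> expression level}
    max_normal_level  -- hypothetical cutoff for "significant" expression
    """
    kept = []
    for gene in candidates:
        # A gene with no normal-tissue data has an empty dict and is kept.
        levels = normal_expression.get(gene, {})
        if all(level <= max_normal_level for level in levels.values()):
            kept.append(gene)
    return kept


# Hypothetical data: GENE_B is highly expressed in normal heart and is dropped.
normal = {
    "GENE_A": {"heart": 20.0, "liver": 10.0},
    "GENE_B": {"heart": 500.0, "liver": 5.0},
}
disease_specific = filter_disease_specific(["GENE_A", "GENE_B"], normal)
```

In embodiments where expression in a dispensable organ is acceptable, the corresponding
tissue entries could simply be omitted from the normal-tissue table before filtering.
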

In a preferred embodiment, lung cancer sequences are those that are up-regulated in
lung cancer; that is, the expression of these genes is higher in cancerous tissue than in normal
lung or other tissue. "Up-regulation" as used herein means, when the ratio is presented as a
number greater than one, that the ratio is greater than one, preferably 1.5 or greater, more
preferably 2.0 or greater. Another embodiment is directed to sequences up-regulated in
non-malignant conditions relative to normal. UniGene cluster identification numbers and
accession numbers herein are for the GenBank sequence database and the sequences of the
accession numbers are hereby expressly incorporated by reference. GenBank is known in the
art, see, e.g., Benson, DA, et al. (1998) Nucleic Acids Research 26:1-7 and
http://www.ncbi.nlm.nih.gov/. Sequences are also available in other databases, e.g.,
European Molecular Biology Laboratory (EMBL) and DNA Database of Japan (DDBJ).
In some situations, the sequences may be derived from assembly of
available sequences or be predicted from genomic DNA using exon prediction algorithms,
such as FGENESH (Salamov and Solovyev (2000) Genome Res. 10:516-522). In other
situations, sequences have been derived from cloning and sequencing of isolated nucleic
acids.

In another preferred embodiment, lung cancer sequences are those that are
down-regulated in lung cancer; that is, the expression of these genes is lower in cancerous
tissue than in normal lung or other tissue. "Down-regulation" as used herein means, when the
ratio is presented as a number greater than one, that the ratio is greater than one, preferably
1.5 or greater, more preferably 2.0 or greater, or, when the ratio is presented as a number less
than one, that the ratio is less than one, preferably 0.5 or less, more preferably 0.25 or less.
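
The ratio-based definitions of up- and down-regulation can be illustrated with a short
sketch. It assumes the ratio is taken as cancerous-tissue expression divided by normal-tissue
expression and uses the "preferably" cutoffs of 1.5 and 0.5 as defaults; the function and
parameter names are hypothetical and chosen only for illustration.

```python
def classify_regulation(tumor_level, normal_level, up_cutoff=1.5, down_cutoff=0.5):
    """Classify a gene from its tumor/normal expression ratio.

    Defaults follow the "preferably" thresholds in the text; pass
    up_cutoff=2.0 or down_cutoff=0.25 for the "more preferably" values.
    """
    if tumor_level <= 0 or normal_level <= 0:
        raise ValueError("expression levels must be positive to form a ratio")
    ratio = tumor_level / normal_level
    if ratio >= up_cutoff:
        return "up-regulated"
    if ratio <= down_cutoff:
        return "down-regulated"
    return "unchanged"


# Hypothetical expression values (arbitrary units).
examples = [
    classify_regulation(300.0, 100.0),  # ratio 3.0
    classify_regulation(40.0, 100.0),   # ratio 0.4
    classify_regulation(120.0, 100.0),  # ratio 1.2
]
```

A down-regulated gene with a ratio of 0.4 could equally be reported as a 2.5-fold decrease
by inverting the ratio, which corresponds to the "number greater than one" presentation.
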


Informatics 

The ability to identify genes that are over- or under-expressed in lung cancer can
additionally provide high-resolution, high-sensitivity datasets which can be used in the areas
of diagnostics, therapeutics, drug development, pharmacogenetics, protein structure,
biosensor development, and other related areas. For example, the expression profiles can be
used in diagnostic or prognostic evaluation of patients with lung cancer. Or as another
example, subcellular toxicological information can be generated to better direct drug structure
and activity correlation (see Anderson (1998) Pharmaceutical Proteomics: Targets,
Mechanism, and Function, paper presented at the IBC Proteomics conference, Coronado, CA
(June 11-12, 1998)). Subcellular toxicological information can also be utilized in a biological
sensor device to predict the likely toxicological effect of chemical exposures and likely
tolerable exposure thresholds (see U.S. Patent No. 5,811,231). Similar advantages accrue
from datasets relevant to other biomolecules and bioactive agents (e.g., nucleic acids,
saccharides, lipids, drugs, and the like).

Thus, in another embodiment, the present invention provides a database that includes
at least one set of assay data. The data contained in the database is acquired, e.g., using array
analysis either singly or in a library format. The database can be in a form in which data can
be maintained and transmitted, but is preferably an electronic database. The electronic
database of the invention can be maintained on any electronic device allowing for the storage
of and access to the database, such as a personal computer, but is preferably distributed on a
wide area network, such as the World Wide Web.

The focus of the present section on databases that include peptide sequence data is for 
clarity of illustration only. It will be apparent to those of skill in the art that similar databases 
can be assembled for assay data acquired using an assay of the invention. 

The compositions and methods for identifying and/or quantitating the relative and/or
absolute abundance of a variety of molecular and macromolecular species from a biological
sample representing lung cancer, i.e., the identification of lung cancer-associated sequences
described herein, provide an abundance of information, which can be correlated with
pathological conditions, predisposition to disease, drug testing, therapeutic monitoring,
gene-disease causal linkages, identification of correlates of immunity and physiological status,
among others. Although the data generated from the assays of the invention is suited for
manual review and analysis, in a preferred embodiment, data processing using high-speed
computers is utilized.

An array of methods for indexing and retrieving biomolecular information is known
in the art. For example, U.S. Patents 6,023,659 and 5,966,712 disclose a relational database
system for storing biomolecular sequence information in a manner that allows sequences to
be catalogued and searched according to one or more protein function hierarchies. U.S.
Patent 5,953,727 discloses a relational database having sequence records containing
information in a format that allows a collection of partial-length DNA sequences to be
catalogued and searched according to association with one or more sequencing projects for
obtaining full-length sequences from the collection of partial length sequences. U.S. Patent
5,706,498 discloses a gene database retrieval system for making a retrieval of a gene
sequence similar to a sequence data item in a gene database based on the degree of similarity
between a key sequence and a target sequence. U.S. Patent 5,538,897 discloses a method
using mass spectroscopy fragmentation patterns of peptides to identify amino acid sequences
in computer databases by comparison of predicted mass spectra with experimentally-derived
mass spectra using a closeness-of-fit measure. U.S. Patent 5,926,818 discloses a
multi-dimensional database comprising a functionality for multi-dimensional data analysis
described as on-line analytical processing (OLAP), which entails the consolidation of
projected and actual data according to more than one consolidation path or dimension. U.S.
Patent 5,295,261 reports a hybrid database structure in which the fields of each database
record are divided into two classes, navigational and informational data, with navigational
fields stored in a hierarchical topological map which can be viewed as a tree structure or as
the merger of two or more such tree structures.

See also Mount, et al. (2001) Bioinformatics; Durbin, et al. (eds., 1999) Biological
Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids; Baxevanis and
Ouellette (eds., 1998) Bioinformatics: A Practical Guide to the Analysis of Genes and
Proteins; Rashidi and Buehler (1999) Bioinformatics: Basic Applications in Biological
Science and Medicine; Setubal, et al. (eds., 1997) Introduction to Computational Molecular
Biology; Misener and Krawetz (eds., 2000) Bioinformatics: Methods and Protocols; Higgins
and Taylor (eds., 2000) Bioinformatics: Sequence, Structure, and Databanks: A Practical
Approach; Brown (2001) Bioinformatics: A Biologist's Guide to Biocomputing and the
Internet; Han and Kamber (2000) Data Mining: Concepts and Techniques; and
Waterman (1995) Introduction to Computational Biology: Maps, Sequences, and Genomes.

The present invention provides a computer database comprising a computer and
software for storing in computer-retrievable form assay data records cross-tabulated, e.g.,
with data specifying the source of the target-containing sample from which each sequence
specificity record was obtained.

In an exemplary embodiment, at least one of the sources of target-containing sample
is from a control tissue sample known to be free of pathological disorders. In a variation, at
least one of the sources is a known pathological tissue specimen, e.g., a neoplastic lesion or
another tissue specimen to be analyzed for lung cancer. In another variation, the assay
records cross-tabulate one or more of the following parameters for each target species in a
sample: (1) a unique identification code, which can include, e.g., a target molecular structure
and/or characteristic separation coordinate (e.g., electrophoretic coordinates); (2) sample
source; and (3) absolute and/or relative quantity of the target species present in the sample.
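
One way such cross-tabulated assay records might be represented is sketched below. The
record type, its field names, and the sample values are hypothetical and serve only to
illustrate the three parameters listed above (identification code, sample source, quantity);
the specification does not prescribe a particular data structure.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AssayRecord:
    target_id: str      # (1) unique identification code for the target species
    sample_source: str  # (2) source of the target-containing sample
    quantity: float     # (3) absolute or relative quantity in that sample


def cross_tabulate(records):
    """Build a table: target_id -> {sample_source -> quantity}."""
    table = defaultdict(dict)
    for record in records:
        table[record.target_id][record.sample_source] = record.quantity
    # Convert to plain dicts so missing keys raise instead of being created.
    return {target: dict(by_source) for target, by_source in table.items()}


# Hypothetical records from a control sample and a pathological specimen.
records = [
    AssayRecord("T001", "control_lung", 12.0),
    AssayRecord("T001", "neoplastic_lesion", 48.0),
    AssayRecord("T002", "control_lung", 7.5),
]
table = cross_tabulate(records)
```

With this layout, the quantities for one target across all sample sources are retrieved by a
single lookup, e.g., table["T001"].
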

The invention also provides for the storage and retrieval of a collection of target data
in a computer data storage apparatus, which can include magnetic disks, optical disks,
magneto-optical disks, DRAM, SRAM, SGRAM, SDRAM, RDRAM, DDR RAM, magnetic
bubble memory devices, and other data storage devices, including CPU registers and on-CPU
data storage arrays. Typically, the target data records are stored as a bit pattern in an array of
magnetic domains on a magnetizable medium or as an array of charge states or transistor gate
states, such as an array of cells in a DRAM device (e.g., each cell comprised of a transistor
and a charge storage area, which may be on the transistor). In one embodiment, the invention
provides such storage devices, and computer systems built therewith, comprising a bit pattern
encoding a protein expression fingerprint record comprising unique identifiers for at least 10
target data records cross-tabulated with target source.

When the target is a peptide or nucleic acid, the invention preferably provides a
method for identifying related peptide or nucleic acid sequences, comprising performing a
computerized comparison between a peptide or nucleic acid sequence assay record stored in
or retrieved from a computer storage device or database and at least one other sequence. The
comparison can include a sequence analysis or comparison algorithm or computer program
embodiment thereof (e.g., FASTA, TFASTA, GAP, BESTFIT) and/or the comparison may
be of the relative amount of a peptide or nucleic acid sequence in a pool of sequences
determined from a polypeptide or nucleic acid sample of a specimen.

The invention also preferably provides a magnetic disk, such as an IBM-compatible
(DOS, Windows, Windows95/98/2000, Windows NT, OS/2) or other format (e.g., Linux,
SunOS, Solaris, AIX, SCO Unix, VMS, MV, Macintosh, etc.) floppy diskette or hard (fixed,
Winchester) disk drive, comprising a bit pattern encoding data from an assay of the invention
in a file format suitable for retrieval and processing in a computerized sequence analysis,
comparison, or relative quantitation method.

The invention also provides a network, comprising a plurality of computing devices
linked via a data link, such as an Ethernet cable (coax or 10BaseT), telephone line, ISDN
line, wireless network, optical fiber, or other suitable signal transmission medium, whereby at
least one network device (e.g., computer, disk array, etc.) comprises a pattern of magnetic
domains (e.g., magnetic disk) and/or charge domains (e.g., an array of DRAM cells)
composing a bit pattern encoding data acquired from an assay of the invention.

The invention also provides a method for transmitting assay data that includes
generating an electronic signal on an electronic communications device, such as a modem,
ISDN terminal adapter, DSL, cable modem, ATM switch, or the like, wherein the signal
includes (in native or encrypted format) a bit pattern encoding data from an assay or a
database comprising a plurality of assay results obtained by the method of the invention.

In a preferred embodiment, the invention provides a computer system for comparing a
query target to a database containing an array of data structures, such as an assay result
obtained by the method of the invention, and ranking database targets based on the degree of
identity and gap weight to the target data. A central processor is preferably initialized to load
and execute the computer program for alignment and/or comparison of the assay results.
Data for a query target is entered into the central processor via an I/O device. Execution of
the computer program results in the central processor retrieving the assay data from the data
file, which comprises a binary description of an assay result.

The target data or record and the computer program can be transferred to secondary
memory, which is typically random access memory (e.g., DRAM, SRAM, SGRAM, or
SDRAM). Targets are ranked according to the degree of correspondence between a selected
assay characteristic (e.g., binding to a selected affinity moiety) and the same characteristic of
the query target and results are output via an I/O device. For example, a central processor
can be a conventional computer (e.g., Intel Pentium, PowerPC, Alpha, PA-8000, SPARC,
MIPS 4400, MIPS 10000, VAX, etc.); a program can be a commercial or public domain
molecular biology software package (e.g., UWGCG Sequence Analysis Software, Darwin); a
data file can be an optical or magnetic disk, a data server, a memory device (e.g., DRAM,
SRAM, SGRAM, SDRAM, EPROM, bubble memory, flash memory, etc.); an I/O device can
be a terminal comprising a video display and a keyboard, a modem, an ISDN terminal
adapter, an Ethernet port, a punched card reader, a magnetic strip reader, or other suitable I/O
device.

The invention also preferably provides the use of a computer system, such as that
described above, which comprises: (1) a computer; (2) a stored bit pattern encoding a
collection of peptide sequence specificity records obtained by the methods of the invention,
which may be stored in the computer; (3) a comparison target, such as a query target; and (4)
a program for alignment and comparison, typically with rank-ordering of comparison results
on the basis of computed similarity values.
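
A program for alignment and rank-ordering of the kind described above might be sketched
as follows. The example substitutes a basic global alignment score (in the style of
Needleman and Wunsch, with a linear gap penalty) for the named packages; it is not the
algorithm of FASTA, GAP, or BESTFIT, and the scoring values, function names, and
sequences are assumptions made for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score the best global alignment of sequences a and b.

    Uses dynamic programming with a linear gap penalty and keeps only
    two rows of the score matrix in memory.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def rank_targets(query, database):
    """Return (name, score) pairs for database targets, best score first."""
    scored = [(name, global_alignment_score(query, seq))
              for name, seq in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical peptide records keyed by record name.
database = {"rec_a": "MKTAY", "rec_b": "MKTY", "rec_c": "GGGGG"}
ranking = rank_targets("MKTAY", database)
```

The match, mismatch, and gap values play the role of the "degree of identity and gap
weight" used for ranking; a practical system would more likely use a substitution matrix
and affine gap penalties.
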

Characteristics of lung cancer-associated proteins

Lung cancer proteins of the present invention may be classified as secreted proteins,
transmembrane proteins or intracellular proteins. In one embodiment, the lung cancer protein
is an intracellular protein. Intracellular proteins may be found in the cytoplasm and/or in the
nucleus. Intracellular proteins are involved in all aspects of cellular function and replication
(including, e.g., signaling pathways); aberrant expression of such proteins often results in
unregulated or dysregulated cellular processes (see, e.g., Alberts (ed. 1994) Molecular
Biology of the Cell (3d ed.)). For example, many intracellular proteins have enzymatic
activity such as protein kinase activity, protein phosphatase activity, protease activity,
nucleotide cyclase activity, polymerase activity and the like. Intracellular proteins also serve
as docking proteins that are involved in organizing complexes of proteins, or targeting
proteins to various subcellular localizations, and are involved in maintaining the structural
integrity of organelles.

An increasingly appreciated concept in characterizing proteins is the presence in the
proteins of one or more structural motifs for which defined functions have been attributed. In
addition to the highly conserved sequences found in the enzymatic domain of proteins, highly
conserved sequences have been identified in proteins that are involved in protein-protein
interaction. For example, Src-homology-2 (SH2) domains bind tyrosine-phosphorylated
targets in a sequence dependent manner. PTB domains, which are distinct from SH2
domains, also bind tyrosine phosphorylated targets. SH3 domains bind to proline-rich
targets. In addition, PH domains, tetratricopeptide repeats and WD domains, to name only a
few, have been shown to mediate protein-protein interactions. Some of these may also be
involved in binding to phospholipids or other second messengers. As will be appreciated by
one of ordinary skill in the art, these motifs can be identified on the basis of amino acid
sequence; thus, an analysis of the sequence of proteins may provide insight into both the
enzymatic potential of the molecule and/or molecules with which the protein may associate.

One useful database is Pfam (protein families), which is a large collection of multiple
sequence alignments and hidden Markov models covering many common protein domains.
Versions are available via the internet from Washington University in St. Louis, the Sanger
Center in England, and the Karolinska Institute in Sweden (see, e.g., Bateman, et al. (2000)
Nuc. Acids Res. 28:263-266; Sonnhammer, et al. (1997) Proteins 28:405-420; Bateman, et al.
(1999) Nuc. Acids Res. 27:260-262; and Sonnhammer, et al. (1998) Nuc. Acids Res.
26:320-322).

In another embodiment, the lung cancer sequences are transmembrane proteins.
Transmembrane proteins are molecules that span a phospholipid bilayer of a cell. They may
have an intracellular domain, an extracellular domain, or both. The intracellular domains of
such proteins may have a number of functions including those already described for
intracellular proteins. For example, the intracellular domain may have enzymatic activity
and/or may serve as a binding site for additional proteins. Frequently the intracellular
domain of transmembrane proteins serves both roles. For example, certain receptor tyrosine
kinases have both protein kinase activity and SH2 domains. In addition, autophosphorylation
of tyrosines on the receptor molecule itself creates binding sites for additional SH2 domain
containing proteins.

Transmembrane proteins may contain from one to many transmembrane domains.
For example, receptor tyrosine kinases, certain cytokine receptors, receptor guanylyl cyclases
and receptor serine/threonine protein kinases contain a single transmembrane domain.
However, various other proteins including channels, pumps, and adenylyl cyclases contain
numerous transmembrane domains. Many important cell surface receptors such as G protein
coupled receptors (GPCRs) are classified as "seven transmembrane domain" proteins, as they
contain 7 membrane spanning regions. Characteristics of transmembrane domains include
approximately 17 consecutive hydrophobic amino acids that may be followed by charged
amino acids. Therefore, upon analysis of the amino acid sequence of a particular protein, the
localization and number of transmembrane domains within the protein may be predicted (see,
e.g., PSORT web site http://psort.nibb.ac.jp/).
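
The idea of locating candidate transmembrane domains from runs of hydrophobic residues
can be illustrated with a deliberately crude sketch. It is not the PSORT method: the set of
residues treated as hydrophobic, the strict requirement of an unbroken run, and the function
name are all simplifying assumptions made for illustration.

```python
# Assumed set of hydrophobic residues (one-letter codes); real predictors
# use graded hydropathy scales rather than a yes/no classification.
HYDROPHOBIC = set("AILMFVWC")


def find_hydrophobic_runs(sequence, min_length=17):
    """Return (start, end) index pairs of unbroken hydrophobic runs.

    Indices are 0-based and end is exclusive, so sequence[start:end]
    is the run. Only runs of at least min_length residues are reported.
    """
    runs = []
    start = None
    for i, residue in enumerate(sequence.upper()):
        if residue in HYDROPHOBIC:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                runs.append((start, i))
            start = None
    # Handle a run that extends to the end of the sequence.
    if start is not None and len(sequence) - start >= min_length:
        runs.append((start, len(sequence)))
    return runs


# Hypothetical sequence: a 20-residue leucine stretch flanked by charged residues.
candidate = "MKR" + "L" * 20 + "KKD"
segments = find_hydrophobic_runs(candidate)
```

The number of reported segments gives a rough estimate of the number of membrane-spanning
regions; seven such segments would be consistent with a GPCR-like topology.
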

The extracellular domains of transmembrane proteins are diverse; however, conserved
motifs are found repeatedly among various extracellular domains. Conserved structure
and/or functions have been ascribed to different extracellular motifs. Many extracellular
domains are involved in binding to other molecules. In one aspect, extracellular domains are
found on receptors. Factors that bind the receptor domain include circulating ligands, which
may be peptides, proteins, or small molecules such as adenosine and the like. For example,
growth factors such as EGF, FGF, and PDGF are circulating growth factors that bind to their
cognate receptors to initiate a variety of cellular responses. Other factors include cytokines,
mitogenic factors, hormones, neurotrophic factors and the like. Extracellular domains also
bind to cell-associated molecules. In this respect, they may mediate cell-cell interactions.
Cell-associated ligands can be tethered to the cell, e.g., via a glycosylphosphatidylinositol
(GPI) anchor, or may themselves be transmembrane proteins. Extracellular domains may
also associate with the extracellular matrix and contribute to the maintenance of the cell
structure.

Lung cancer proteins that are transmembrane are particularly preferred in the present
invention as they are readily accessible targets for extracellular immunotherapeutics, as are
described herein. In addition, as outlined below, transmembrane proteins can also be useful
in imaging modalities. Antibodies may be used to label such readily accessible proteins in
situ or in histological analysis. Alternatively, antibodies can also label intracellular proteins,
in which case analytical samples are typically permeabilized to provide access to intracellular
proteins. In addition, some membrane proteins can be processed to release a soluble protein,
or to expose a residual fragment. Released soluble proteins may be useful diagnostic
markers; processed residual protein fragments may be useful markers of lung disease.

It will also be appreciated by those in the art that a transmembrane protein can be 
made soluble by removing transmembrane sequences, e.g., through recombinant methods. 
Furthermore, transmembrane proteins that have been made soluble can be made to be 
secreted through recombinant means by adding an appropriate signal sequence. 

In another embodiment, the lung cancer proteins are secreted proteins, the secretion of
which can be either constitutive or regulated. These proteins may have a signal peptide or
signal sequence that targets the molecule to the secretory pathway. Secreted proteins are
involved in numerous physiological events; e.g., if circulating, they often serve to transmit
signals to various other cell types. The secreted protein may function in an autocrine manner
(acting on the cell that secreted the factor), a paracrine manner (acting on cells in close
proximity to the cell that secreted the factor), an endocrine manner (acting on cells at a
distance, e.g., secretion into the blood stream), or an exocrine manner (secretion, e.g.,
through a duct or to an adjacent epithelial surface, as with sweat glands, sebaceous glands,
pancreatic ducts, lacrimal glands, mammary glands, wax producing glands of the ear, etc.).
Thus secreted molecules often find use in modulating or altering numerous aspects of
physiology. Lung cancer proteins that are secreted proteins are particularly preferred in the
present invention as they serve as good targets for diagnostic markers, e.g., for blood, plasma,
serum, or stool tests. Those which are enzymes may be antibody or small molecule targets.
Others may be useful as vaccine targets, e.g., via CTL mechanisms.

Use of lung cancer nucleic acids 

As described above, a lung cancer sequence is initially identified by substantial nucleic
acid and/or amino acid sequence homology or linkage to the lung cancer sequences outlined
herein. Such homology can be based upon the overall nucleic acid or amino acid sequence,
and is generally determined as outlined below, using either homology programs or
hybridization conditions. Typically, linked sequences on an mRNA are found on the same
molecule.

The lung cancer nucleic acid sequences of the invention, e.g., the sequences in Tables
1A-16, can be fragments of larger genes, i.e., they are nucleic acid segments. "Genes" in this
context includes coding regions, non-coding regions, and mixtures of coding and non-coding
regions. Accordingly, as will be appreciated by those in the art, using the sequences provided
herein, extended sequences, in either direction, of the lung cancer genes can be obtained,
using techniques well known in the art for cloning either longer sequences or the full length
sequences; see Ausubel, et al., supra. Much can be done by informatics and many sequences
can be clustered to include multiple sequences corresponding to a single gene, e.g., systems
such as UniGene (see http://www.ncbi.nlm.nih.gov/UniGene/).

Once a lung cancer nucleic acid is identified, it can be cloned and, if necessary, its
constituent parts recombined to form the entire lung cancer nucleic acid coding regions or the
entire mRNA sequence. Once isolated from its natural source, e.g., contained within a
plasmid or other vector or excised therefrom as a linear nucleic acid segment, the
recombinant lung cancer nucleic acid can be further used as a probe to identify and isolate
other lung cancer nucleic acids, e.g., extended coding regions. It can also be used as a
"precursor" nucleic acid to make modified or variant lung cancer nucleic acids and proteins.

The lung cancer nucleic acids of the present invention are used in several ways. In a
first embodiment, nucleic acid probes to the lung cancer nucleic acids are made and attached
to biochips to be used in screening and diagnostic methods, as outlined below, or for
administration, e.g., for gene therapy, RNAi, vaccine, and/or antisense applications.
Alternatively, the lung cancer nucleic acids that include coding regions of lung cancer
proteins can be put into expression vectors for the expression of lung cancer proteins, again
for screening purposes or for administration to a patient.

In a preferred embodiment, nucleic acid probes to lung cancer nucleic acids (both the
nucleic acid sequences outlined in the figures and/or the complements thereof) are made.
The nucleic acid probes attached to the biochip are designed to be substantially
complementary to the lung cancer nucleic acids, i.e., the target sequence (either the target
sequence of the sample or to other probe sequences, e.g., in sandwich assays), such that
hybridization of the target sequence and the probes of the present invention occurs. As
outlined below, this complementarity need not be perfect; there may be any number of base
pair mismatches which will interfere with hybridization between the target sequence and the
single stranded nucleic acids of the present invention. However, if the number of mutations
is so great that no hybridization can occur under even the least stringent of hybridization
conditions, the sequence is not a complementary target sequence. Thus, by "substantially
complementary" herein is meant that the probes are sufficiently complementary to the target
sequences to hybridize under appropriate reaction conditions, particularly high stringency
conditions, as outlined herein.
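
As a rough illustration of the complementarity idea described above, the following Python
sketch computes the reverse complement of a target site and counts mismatches against a
candidate probe. The function names, the example sequence, and the 10% mismatch cutoff are
hypothetical choices made for illustration only; actual hybridization depends on stringency
conditions, base composition, and probe length, none of which this simple count models.

```python
# Minimal sketch: mismatch counting between a probe and a target site.
# All names and the mismatch cutoff are illustrative assumptions.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_mismatches(probe: str, target_site: str) -> int:
    """Count positions where the probe differs from the perfect complement."""
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must have the same length")
    perfect = reverse_complement(target_site)
    return sum(1 for p, q in zip(probe.upper(), perfect) if p != q)


def is_substantially_complementary(
    probe: str, target_site: str, max_mismatch_fraction: float = 0.1
) -> bool:
    """Crude proxy: accept probes whose mismatch fraction is under a cutoff.

    The cutoff is an assumed value, not one taken from the specification.
    """
    return count_mismatches(probe, target_site) <= max_mismatch_fraction * len(probe)


if __name__ == "__main__":
    target = "ATGGCCTTAGGC"  # hypothetical target site
    print(reverse_complement(target))
    print(count_mismatches("GCCTAAGGCCAA", target))
```

A probe with a single mismatch over twelve bases passes the assumed cutoff, while one with two
mismatches does not; in practice the cutoff would be replaced by empirical hybridization data.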

A nucleic acid probe is generally single stranded but can be partially single and
partially double stranded. The strandedness of the probe is dictated by the structure,
composition, and properties of the target sequence. In general, the nucleic acid probes range
from about 8 to about 100 bases long, with from about 10 to about 80 bases being preferred,
and from about 30 to about 50 bases being particularly preferred. That is, generally
complements of ORFs or whole genes are not used. In some embodiments, nucleic acids of
lengths up to hundreds of bases can be used.

In a preferred embodiment, more than one probe per sequence is used, with either
overlapping probes or probes to different sections of the target being used. That is, two,
three, four or more probes, with three being preferred, are used to build in a redundancy for a
particular target. The probes can be overlapping (i.e., have some sequence in common), or
separate. In some cases, PCR primers may be used to amplify signal for higher sensitivity.
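
The redundancy scheme described above can be sketched in Python as follows. This is a
minimal sketch under stated assumptions: the function name `tile_probes`, the default probe
length of 40 bases (chosen from within the particularly preferred 30 to 50 base range), and the
even spacing rule are all hypothetical and are not taken from the specification.

```python
# Minimal sketch: place several evenly spaced probes along one target.
# Probe length, probe count, and spacing rule are illustrative assumptions.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tile_probes(target: str, probe_len: int = 40, n_probes: int = 3):
    """Return (start, probe) pairs, where each probe is the reverse
    complement of a window of the target.

    Windows are spread evenly from the first to the last possible start,
    so they overlap on short targets and are separate on long ones.
    """
    if n_probes < 1:
        raise ValueError("n_probes must be at least 1")
    if len(target) < probe_len:
        raise ValueError("target is shorter than the probe length")
    span = len(target) - probe_len
    if n_probes == 1:
        starts = [0]
    else:
        starts = [i * span // (n_probes - 1) for i in range(n_probes)]
    return [(s, reverse_complement(target[s:s + probe_len])) for s in starts]


if __name__ == "__main__":
    demo_target = "ACGT" * 25  # hypothetical 100-base target
    for start, probe in tile_probes(demo_target):
        print(start, probe)
```

With a 100-base target the three windows overlap by ten bases each; with a 200-base target
the same call yields three separate, non-overlapping probes.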

As will be appreciated by those in the art, nucleic acids can be attached or
immobilized to a solid support in a wide variety of ways. By "immobilized" and grammatical
equivalents herein is meant the association or binding between the nucleic acid probe and the
solid support is sufficient to be stable under the conditions of binding, washing, analysis, and
removal as outlined below. The binding can typically be covalent or non-covalent. By
"non-covalent binding" and grammatical equivalents herein is typically meant one or more of
electrostatic, hydrophilic, and hydrophobic interactions. Included in non-covalent binding is
the covalent attachment of a molecule, such as streptavidin, to the support and the
non-covalent binding of the biotinylated probe to the streptavidin. By "covalent binding" and
grammatical equivalents herein is meant that the two moieties, the solid support and the
probe, are attached by at least one bond, including sigma bonds, pi bonds and coordination
bonds. Covalent bonds can be formed directly between the probe and the solid support or can
be formed by a cross linker or by inclusion of a specific reactive group on either the solid
support or the probe or both molecules. Immobilization may also involve a combination of
covalent and non-covalent interactions.

In general, the probes are attached to a biochip in a wide variety of ways, as will be
appreciated by those in the art. As described herein, the nucleic acids can either be
synthesized first, with subsequent attachment to the biochip, or can be directly synthesized on
the biochip.

The biochip comprises a suitable solid substrate. By "substrate" or "solid support" or
other grammatical equivalents herein is meant a material that can be modified for the
attachment or association of the nucleic acid probes and is amenable to at least one detection
method. Often the substrate may contain discrete individual sites appropriate for individual
partitioning and identification. As will be appreciated by those in the art, the number of
possible substrates is very large, and includes, but is not limited to, glass and modified or
functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and
other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon, etc.),
polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including
silicon and modified silicon, carbon, metals, inorganic glasses, plastics, etc. In general, the
substrates allow optical detection and do not appreciably fluoresce. A preferred substrate is
described in US application entitled Reusable Low Fluorescent Plastic Biochip, U.S.
Application Serial No. 09/270,214, filed March 15, 1999, herein incorporated by reference in
its entirety.

Generally the substrate is planar, although as will be appreciated by those in the art,
other configurations of substrates may be used as well. For example, the probes may be
placed on the inside surface of a tube, for flow-through sample analysis to minimize sample
volume. Similarly, the substrate may be flexible, such as a flexible foam, including closed
cell foams made of particular plastics.

In a preferred embodiment, the surface of the biochip and the probe may be
derivatized with chemical functional groups for subsequent attachment of the two. Thus, e.g.,
the biochip is derivatized with a chemical functional group including, but not limited to,
amino groups, carboxy groups, oxo groups and thiol groups, with amino groups being
particularly preferred. Using these functional groups, the probes can be attached using
functional groups on the probes. For example, nucleic acids containing amino groups can be
attached to surfaces comprising amino groups, e.g., using linkers as are known in the art; e.g.,
homo- or hetero-bifunctional linkers as are well known (see 1994 Pierce Chemical Company
catalog, technical section on cross-linkers, pages 155-200). In addition, in some cases,
additional linkers, such as alkyl groups (including substituted and heteroalkyl groups) may be
used.

In this embodiment, oligonucleotides are synthesized, and then attached to the surface
of the solid support. Either the 5' or 3' terminus may be attached to the solid support, or
attachment may be via linkage to an internal nucleoside.

In another embodiment, the immobilization to the solid support may be very strong,
yet non-covalent. For example, biotinylated oligonucleotides can be made, which bind to
surfaces covalently coated with streptavidin, resulting in attachment.

Alternatively, the oligonucleotides may be synthesized on the surface, as is known in
the art. For example, photoactivation techniques utilizing photopolymerization compounds
and techniques are used. In a preferred embodiment, the nucleic acids can be synthesized in
situ, using known photolithographic techniques, such as those described in WO 95/25116;
WO 95/35505; U.S. Patent Nos. 5,700,637 and 5,445,934; and references cited within, all of
which are expressly incorporated by reference; these methods of attachment form the basis of
the Affymetrix GeneChip™ technology.

Often, amplification-based assays are performed to measure the expression level of
lung cancer-associated sequences. These assays are typically performed in conjunction with
reverse transcription. In such assays, a lung cancer-associated nucleic acid sequence acts as a
template in an amplification reaction (e.g., Polymerase Chain Reaction, or PCR). In a
quantitative amplification, the amount of amplification product will be proportional to the
amount of template in the original sample. Comparison to appropriate controls provides a
measure of the amount of lung cancer-associated RNA. Methods of quantitative
amplification are well known to those of skill in the art. Detailed protocols for quantitative
PCR are provided, e.g., in Innis, et al. (1990) PCR Protocols: A Guide to Methods and
Applications.

In some embodiments, a TaqMan based assay is used to measure expression.
TaqMan based assays use a fluorogenic oligonucleotide probe that contains a 5' fluorescent
dye and a 3' quenching agent. The probe hybridizes to a PCR product, but cannot itself be
extended due to a blocking agent at the 3' end. When the PCR product is amplified in
subsequent cycles, the 5' nuclease activity of the polymerase, e.g., AmpliTaq, results in the
cleavage of the TaqMan probe. This cleavage separates the 5' fluorescent dye and the 3'
quenching agent, thereby resulting in an increase in fluorescence as a function of
amplification (see, e.g., literature provided by Perkin-Elmer, e.g., www2.perkin-elmer.com).
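
To illustrate how comparison to controls can yield a relative expression measure from
fluorescence-based amplification data, the following Python sketch applies the widely used
comparative threshold-cycle (delta-delta Ct) calculation. The specification does not prescribe
this calculation; it is shown here only as one common approach. The function name, the
threshold-cycle values, and the assumed per-cycle amplification factor of 2.0 (perfect doubling)
are hypothetical.

```python
# Minimal sketch: comparative Ct (delta-delta Ct) relative quantification.
# Assumes near-perfect doubling per cycle; all input values are hypothetical.


def relative_expression(
    ct_target_sample: float,
    ct_reference_sample: float,
    ct_target_control: float,
    ct_reference_control: float,
    amplification_factor: float = 2.0,
) -> float:
    """Return the fold change of the target in the sample relative to the control.

    Each target Ct is first normalized to a reference (housekeeping) gene
    measured in the same specimen, then the two normalized values are compared.
    """
    delta_sample = ct_target_sample - ct_reference_sample
    delta_control = ct_target_control - ct_reference_control
    delta_delta = delta_sample - delta_control
    return amplification_factor ** -delta_delta


if __name__ == "__main__":
    # Hypothetical values: tumor sample versus normal lung control.
    fold = relative_expression(22.0, 18.0, 25.0, 18.0)
    print(f"fold change: {fold}")
```

A lower threshold cycle in the sample than in the control, after normalization, corresponds to
a fold change above one, i.e., higher apparent expression in the sample.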

Other suitable amplification methods include, but are not limited to, ligase chain
reaction (LCR) (see Wu and Wallace (1989) Genomics 4:560, Landegren, et al. (1988)
Science 241:1077, and Barringer, et al. (1990) Gene 89:117), transcription amplification
(Kwoh, et al. (1989) Proc. Natl. Acad. Sci. USA 86:1173), self-sustained sequence
replication (Guatelli, et al. (1990) Proc. Natl. Acad. Sci. USA 87:1874), dot PCR, and linker
adapter PCR, etc.

Expression of lung cancer proteins from nucleic acids 

In a preferred embodiment, lung cancer nucleic acids, e.g., encoding lung cancer
proteins, are used to make a variety of expression vectors to express lung cancer proteins
which can then be used in screening assays, as described below. Expression vectors and
recombinant DNA technology are well known to those of skill in the art (see, e.g., Ausubel,
supra, and Fernandez and Hoeffler (eds. 1999) Gene Expression Systems) and are used to
express proteins. The expression vectors may be either self-replicating extrachromosomal
vectors or vectors which integrate into a host genome. Generally, these expression vectors
include transcriptional and translational regulatory nucleic acid operably linked to the nucleic
acid encoding the lung cancer protein. The term "control sequences" refers to DNA
sequences used for the expression of an operably linked coding sequence in a particular host
organism. Control sequences that are suitable for prokaryotes, e.g., include a promoter,
optionally an operator sequence, and a ribosome binding site. Eukaryotic cells are known to
utilize promoters, polyadenylation signals, and enhancers.

Nucleic acid is "operably linked" when it is placed into a functional relationship with
another nucleic acid sequence. For example, DNA for a presequence or secretory leader is
operably linked to DNA for a polypeptide if it is expressed as a preprotein that participates in
the secretion of the polypeptide; a promoter or enhancer is operably linked to a coding
sequence if it affects the transcription of the sequence; or a ribosome binding site is operably
linked to a coding sequence if it is positioned so as to facilitate translation. Generally,
"operably linked" means that the DNA sequences being linked are contiguous, and, in the
case of a secretory leader, contiguous and in reading phase. However, enhancers do not have
to be contiguous. Linking is typically accomplished by ligation at convenient restriction
sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are used in
accordance with conventional practice. Transcriptional and translational regulatory nucleic
acid will generally be appropriate to the host cell used to express the lung cancer protein.
Numerous types of appropriate expression vectors, and suitable regulatory sequences are
known in the art for a variety of host cells.

In general, transcriptional and translational regulatory sequences may include, but are
not limited to, promoter sequences, ribosomal binding sites, transcriptional start and stop
sequences, translational start and stop sequences, and enhancer or activator sequences. In a
preferred embodiment, the regulatory sequences include a promoter and transcriptional start
and stop sequences.

Promoter sequences may be either constitutive or inducible promoters. The promoters
may be either naturally occurring promoters or hybrid promoters. Hybrid promoters, which
combine elements of more than one promoter, are also known in the art, and are useful in the
present invention.

In addition, an expression vector may comprise additional elements. For example, the
expression vector may have two replication systems, thus allowing it to be maintained in two
organisms, e.g., in mammalian or insect cells for expression and in a prokaryotic host for
cloning and amplification. Furthermore, for integrating expression vectors, the expression
vector often contains at least one sequence homologous to the host cell genome, and
preferably two homologous sequences which flank the expression construct. The integrating
vector may be directed to a specific locus in the host cell by selecting the appropriate
homologous sequence for inclusion in the vector. Constructs for integrating vectors are well
known in the art (e.g., Fernandez and Hoeffler, supra).

In addition, in a preferred embodiment, the expression vector contains a selectable
marker gene to allow the selection of transformed host cells. Selection genes are well known
in the art and will vary with the host cell used.

The lung cancer proteins of the present invention are usually produced by culturing a
host cell transformed with an expression vector containing nucleic acid encoding a lung
cancer protein, under the appropriate conditions to induce or cause expression of the lung
cancer protein. Conditions appropriate for lung cancer protein expression will vary with the
choice of the expression vector and the host cell, and will be easily ascertained by one skilled
in the art through routine experimentation or optimization. For example, the use of
constitutive promoters in the expression vector will require optimizing the growth and
proliferation of the host cell, while the use of an inducible promoter requires the appropriate
growth conditions for induction. In addition, in some embodiments, the timing of the harvest
is important. For example, the baculoviral systems used in insect cell expression are lytic
viruses, and thus harvest time selection can be crucial for product yield.

Appropriate host cells include yeast, bacteria, archaebacteria, fungi, and insect and
animal cells, including mammalian cells. Of particular interest are Saccharomyces cerevisiae
and other yeasts, E. coli, Bacillus subtilis, Sf9 cells, C129 cells, 293 cells, Neurospora, BHK,
CHO, COS, HeLa cells, HUVEC (human umbilical vein endothelial cells), THP1 cells (a
macrophage cell line) and various other human cells and cell lines.

In a preferred embodiment, the lung cancer proteins are expressed in mammalian
cells. Mammalian expression systems are also known in the art, and include retroviral and
adenoviral systems. Of particular use as mammalian promoters are the promoters from
mammalian viral genes, since the viral genes are often highly expressed and have a broad
host range. Examples include the SV40 early promoter, mouse mammary tumor virus LTR
promoter, adenovirus major late promoter, herpes simplex virus promoter, and the CMV
promoter (see, e.g., Fernandez and Hoeffler, supra). Typically, transcription termination and
polyadenylation sequences recognized by mammalian cells are regulatory regions located 3'
to the translation stop codon and thus, together with the promoter elements, flank the coding
sequence. Examples of transcription terminator and polyadenylation signals include those
derived from SV40.

The methods of introducing exogenous nucleic acid into mammalian hosts, as well as
other hosts, are well known in the art, and will vary with the host cell used. Techniques
include dextran-mediated transfection, calcium phosphate precipitation, polybrene mediated
transfection, protoplast fusion, electroporation, viral infection, encapsulation of the
polynucleotide(s) in liposomes, and direct microinjection of the DNA into nuclei.

In a preferred embodiment, lung cancer proteins are expressed in bacterial systems.
Promoters from bacteriophage may also be used and are known in the art. In addition,
synthetic promoters and hybrid promoters are also useful; e.g., the tac promoter is a hybrid of
the trp and lac promoter sequences. Furthermore, a bacterial promoter can include naturally
occurring promoters of non-bacterial origin that have the ability to bind bacterial RNA
polymerase and initiate transcription. In addition to a functioning promoter sequence, an
efficient ribosome binding site is desirable. The expression vector may also include a signal
peptide sequence that provides for secretion of the lung cancer protein in bacteria. The
protein is either secreted into the growth media (gram-positive bacteria) or into the
periplasmic space, located between the inner and outer membrane of the cell (gram-negative
bacteria). The bacterial expression vector may also include a selectable marker gene to allow
for the selection of bacterial strains that have been transformed. Suitable selection genes
include genes which render the bacteria resistant to drugs such as ampicillin,
chloramphenicol, erythromycin, kanamycin, neomycin and tetracycline. Selectable markers
also include biosynthetic genes, such as those in the histidine, tryptophan and leucine
biosynthetic pathways. These components are assembled into expression vectors. Expression
vectors for bacteria are well known in the art, and include vectors for Bacillus subtilis, E.
coli, Streptococcus cremoris, and Streptococcus lividans, among others (e.g., Fernandez and
Hoeffler, supra). The bacterial expression vectors are transformed into bacterial host cells
using techniques well known in the art, such as calcium chloride treatment, electroporation,
and others.

In one embodiment, lung cancer proteins are produced in insect cells. Expression
vectors for the transformation of insect cells, and in particular, baculovirus-based expression
vectors, are well known in the art.

In a preferred embodiment, lung cancer protein is produced in yeast cells. Yeast
expression systems are well known in the art, and include expression vectors for
Saccharomyces cerevisiae, Candida albicans and C. maltosa, Hansenula polymorpha,
Kluyveromyces fragilis and K. lactis, Pichia guilliermondii and P. pastoris,
Schizosaccharomyces pombe, and Yarrowia lipolytica.

The lung cancer protein may also be made as a fusion protein, using techniques well
known in the art. Thus, e.g., for the creation of monoclonal antibodies, if the desired epitope
is small, the lung cancer protein may be fused to a carrier protein to form an immunogen.
Alternatively, the lung cancer protein may be made as a fusion protein to increase expression,
for affinity purification purposes, or for other reasons. For example, when the lung cancer
protein is a lung cancer peptide, the nucleic acid encoding the peptide may be linked to other
nucleic acid for expression purposes.

In a preferred embodiment, the lung cancer protein is purified or isolated after
expression. Lung cancer proteins may be isolated or purified in a variety of appropriate
ways. Standard purification methods include electrophoretic, molecular, immunological and
chromatographic techniques, including ion exchange, hydrophobic, affinity, and
reverse-phase HPLC chromatography, and chromatofocusing. For example, the lung cancer
protein may be purified using a standard anti-lung cancer protein antibody column.
Ultrafiltration and diafiltration techniques, in conjunction with protein concentration, are
also useful. For general guidance in suitable purification techniques, see Scopes (1982)
Protein Purification. The degree of purification necessary will vary depending on the use of
the lung cancer protein. In some instances no purification will be necessary.

Once expressed and purified if necessary, the lung cancer proteins and nucleic acids
are useful in a number of applications. They may be used as immunoselection reagents, as
vaccine reagents, as screening agents, therapeutic entities, for production of antibodies, as
transcription or translation inhibitors, etc.

Variants of lung cancer proteins

In one embodiment, the lung cancer proteins are derivative or variant lung cancer
proteins as compared to the wild-type sequence. That is, as outlined more fully below, the
derivative lung cancer peptide will often contain at least one amino acid substitution, deletion
or insertion, with amino acid substitutions being particularly preferred. The amino acid
substitution, insertion or deletion may occur at a particular residue within the lung cancer
peptide.

Also included within one embodiment of lung cancer proteins of the present invention
are amino acid sequence variants. These variants typically fall into one or more of three
classes: substitutional, insertional or deletional variants. These variants ordinarily are
prepared by site specific mutagenesis of nucleotides in the DNA encoding the lung cancer
protein, using cassette or PCR mutagenesis or other techniques, to produce DNA encoding
the variant, and thereafter expressing the DNA in recombinant cell culture as outlined above.
However, variant lung cancer protein fragments having up to about 100-150 residues may be
prepared by in vitro synthesis. Amino acid sequence variants are characterized by the
predetermined nature of the variation, a feature that sets them apart from naturally occurring
allelic or interspecies variation of the lung cancer protein amino acid sequence. The variants
typically exhibit a similar qualitative biological activity as the naturally occurring analogue,
although variants can also be selected which have modified characteristics as will be more
fully outlined below.

While the site or region for introducing an amino acid sequence variation is often
predetermined, the mutation per se need not be predetermined. For example, in order to
optimize the performance of a mutation at a given site, random mutagenesis may be
conducted at the target codon or region and the expressed lung cancer variants screened for
the optimal combination of desired activity. Techniques exist for making substitution
mutations at predetermined sites in DNA having a known sequence, e.g., M13 primer
mutagenesis and PCR mutagenesis. Screening of mutants is often done using assays of lung
cancer protein activities.
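
The effect of a substitution mutation at a predetermined codon can be sketched in silico as
follows. This Python example is illustrative only: the coding sequence, the function names,
and the chosen replacement codon are hypothetical, and the sketch models only the sequence
change, not the laboratory mutagenesis procedure. It uses the standard genetic code.

```python
# Minimal sketch: replace one codon in a coding sequence and translate.
# The example sequence and names are hypothetical; standard genetic code.
from itertools import product

_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(cds: str) -> str:
    """Translate an uppercase coding DNA sequence, ignoring a trailing partial codon."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def mutate_codon(cds: str, residue_number: int, new_codon: str) -> str:
    """Replace the codon for the given 1-based residue number with new_codon."""
    start = (residue_number - 1) * 3
    if len(new_codon) != 3 or new_codon not in CODON_TABLE:
        raise ValueError("new_codon must be a valid three-base codon")
    if residue_number < 1 or start + 3 > len(cds):
        raise ValueError("residue_number is outside the coding sequence")
    return cds[:start] + new_codon + cds[start + 3:]


if __name__ == "__main__":
    wild_type = "ATGGCTAAAGGT"  # hypothetical sequence encoding M-A-K-G
    variant = mutate_codon(wild_type, 2, "TCT")  # Ala -> Ser at residue 2
    print(translate(wild_type), "->", translate(variant))
```

For random mutagenesis at a target codon, the same helper could be called once for each of the
64 codons in `CODON_TABLE` to enumerate every possible variant at that position.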

Amino acid substitutions are typically of single residues; insertions usually will be on
the order of from about 1 to 20 amino acids, although considerably larger insertions may be
occasionally tolerated. Deletions generally range from about 1 to about 20 residues, although
in some cases deletions may be much larger.
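
The three variant classes can be represented as simple operations on a peptide string, as in
the following Python sketch. The peptide sequence and function names are hypothetical and
serve only to make the position conventions concrete (residue positions are 1-based here).

```python
# Minimal sketch: substitutional, insertional, and deletional peptide variants.
# Positions are 1-based; the example peptide is hypothetical.


def substitute(peptide: str, position: int, new_residue: str) -> str:
    """Replace the single residue at the given position."""
    if not 1 <= position <= len(peptide) or len(new_residue) != 1:
        raise ValueError("invalid position or residue")
    return peptide[:position - 1] + new_residue + peptide[position:]


def insert_after(peptide: str, position: int, residues: str) -> str:
    """Insert one or more residues immediately after the given position."""
    if not 0 <= position <= len(peptide):
        raise ValueError("invalid position")
    return peptide[:position] + residues + peptide[position:]


def delete(peptide: str, position: int, length: int = 1) -> str:
    """Delete `length` residues starting at the given position."""
    if length < 1 or not 1 <= position <= len(peptide) - length + 1:
        raise ValueError("invalid position or length")
    return peptide[:position - 1] + peptide[position - 1 + length:]


if __name__ == "__main__":
    wild_type = "MKTLLVAG"  # hypothetical peptide
    print(substitute(wild_type, 3, "S"))
    print(insert_after(wild_type, 2, "GG"))
    print(delete(wild_type, 4, 2))
```

Because each function returns a new string, the operations can be chained to build a
derivative that combines substitutions, insertions, and deletions.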

Substitutions, deletions, insertions or any combination thereof may be used to arrive
at a final derivative. Generally these changes are done on a few amino acids to minimize the
alteration of the molecule. Larger changes may be tolerated in certain circumstances. When
small alterations in the characteristics of a lung cancer protein are desired, substitutions are
generally made in accordance with the amino acid substitution chart provided in the
definition section.

Variants typically exhibit essentially the same qualitative biological activity and will
elicit the same immune response as a naturally-occurring analog, although variants also are
selected to modify the characteristics of lung cancer proteins as needed. Alternatively, the
variant may be designed or reorganized such that a biological activity of the lung cancer
protein is altered. For example, glycosylation sites may be added, altered, or removed.

Covalent modifications of lung cancer polypeptides are included within the scope of
this invention. One type of covalent modification includes reacting targeted amino acid
residues of a lung cancer polypeptide with an organic derivatizing agent that is capable of
reacting with selected side chains or the N- or C-terminal residues of a lung cancer
polypeptide. Derivatization with bifunctional agents is useful, for instance, for crosslinking
lung cancer polypeptides to a water-insoluble support matrix or surface for use in a method
for purifying anti-lung cancer polypeptide antibodies or screening assays, as is more fully
described below. Commonly used crosslinking agents include, e.g.,
1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, N-hydroxysuccinimide esters, e.g.,
esters with 4-azidosalicylic acid, homobifunctional imidoesters, including disuccinimidyl
esters such as 3,3'-dithiobis(succinimidylpropionate), bifunctional maleimides such as
bis-N-maleimido-1,8-octane and agents such as
methyl-3-((p-azidophenyl)dithio)propioimidate.

Other modifications include deamidation of glutaminyl and asparaginyl residues to
the corresponding glutamyl and aspartyl residues, respectively, hydroxylation of proline and
lysine, phosphorylation of hydroxyl groups of serinyl, threonyl or tyrosyl residues,
methylation of the α-amino groups of lysine, arginine, and histidine side chains (Creighton
(1983) Proteins: Structure and Molecular Properties, pp. 79-86), acetylation of the N-terminal
amine, and amidation of any C-terminal carboxyl group.

Another type of covalent modification of the lung cancer polypeptide encompassed by
this invention is an altered native glycosylation pattern of the polypeptide. "Altering the
native glycosylation pattern" is intended herein to mean adding to or deleting one or more
carbohydrate moieties of a native sequence lung cancer polypeptide. Glycosylation patterns
can be altered in many ways. For example, the use of different cell types to express lung
cancer-associated sequences can result in different glycosylation patterns.

Addition of glycosylation sites to lung cancer polypeptides may also be accomplished
by altering the amino acid sequence thereof. The alteration may be made, e.g., by the
addition of, or substitution by, one or more serine or threonine residues to the native sequence
lung cancer polypeptide (for O-linked glycosylation sites). The lung cancer amino acid
sequence may optionally be altered through changes at the DNA level, particularly by
mutating the DNA encoding the lung cancer polypeptide at preselected bases such that
codons are generated that will translate into the desired amino acids.

Another means of increasing the number of carbohydrate moieties on the lung cancer
polypeptide is by chemical or enzymatic coupling of glycosides to the polypeptide. Such
methods are described in the art, e.g., in WO 87/05330, and in Aplin and Wriston (1981)
CRC Crit. Rev. Biochem., pp. 259-306.

Removal of carbohydrate moieties present on the lung cancer polypeptide may be
accomplished chemically or enzymatically or by mutational substitution of codons encoding
for amino acid residues that serve as targets for glycosylation. Chemical deglycosylation
techniques are known in the art and described, for instance, by Hakimuddin, et al. (1987)
Arch. Biochem. Biophys. 259:52 and by Edge, et al. (1981) Anal. Biochem. 118:131.
Enzymatic cleavage of carbohydrate moieties on polypeptides can be achieved by the use of a
variety of endo- and exo-glycosidases as described by Thotakura, et al. (1987) Meth.
Enzymol. 138:350.

Another type of covalent modification of lung cancer polypeptides comprises linking the
lung cancer polypeptide to one of a variety of nonproteinaceous polymers, e.g., polyethylene
glycol, polypropylene glycol, or polyoxyalkylenes, in the manner set forth in U.S. Patent
Nos. 4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192, or 4,179,337.

Lung cancer polypeptides of the present invention may also be modified in a way to
form chimeric molecules comprising a lung cancer polypeptide fused to another,
heterologous polypeptide or amino acid sequence. In one embodiment, such a chimeric
molecule comprises a fusion of a lung cancer polypeptide with a tag polypeptide which
provides an epitope to which an anti-tag antibody can selectively bind. The epitope tag is
generally placed at the amino- or carboxyl-terminus of the lung cancer polypeptide. The
presence of such epitope-tagged forms of a lung cancer polypeptide can be detected using an
antibody against the tag polypeptide. Also, provision of the epitope tag enables the lung
cancer polypeptide to be readily purified by affinity purification using an anti-tag antibody or
another type of affinity matrix that binds to the epitope tag. In an alternative embodiment,
the chimeric molecule may comprise a fusion of a lung cancer polypeptide with an
immunoglobulin or a particular region of an immunoglobulin. For a bivalent form of the
chimeric molecule, such a fusion could be to the Fc region of an IgG molecule.

Various tag polypeptides and their respective antibodies are well known and examples
include poly-histidine (poly-his) or poly-histidine-glycine (poly-his-gly) tags; HIS6 and metal
chelation tags; the flu HA tag polypeptide and its antibody 12CA5 (Field, et al. (1988) Mol.
Cell. Biol. 8:2159-2165); the c-myc tag and the 8F9, 3C7, 6E10, G4, B7 and 9E10 antibodies
thereto (Evan, et al. (1985) Molecular and Cellular Biology 5:3610-3616); and the Herpes
Simplex virus glycoprotein D (gD) tag and its antibody (Paborsky, et al. (1990) Protein
Engineering 3(6):547-553). Other tag polypeptides include the Flag-peptide (Hopp, et al.
(1988) BioTechnology 6:1204-1210); the KT3 epitope peptide (Martin, et al. (1992) Science
255:192-194); tubulin epitope peptide (Skinner, et al. (1991) J. Biol. Chem.
266:15163-15166); and the T7 gene 10 protein peptide tag (Lutz-Freyermuth, et al. (1990)
Proc. Nat'l Acad. Sci. USA 87:6393-6397).

Also included are other lung cancer proteins of the lung cancer family, and lung
cancer proteins from other organisms, which are cloned and expressed as outlined below.
Thus, probe or degenerate polymerase chain reaction (PCR) primer sequences may be used to
find other related lung cancer proteins from primates or other organisms. As will be
appreciated by those in the art, particularly useful probe and/or PCR primer sequences
include unique areas of the lung cancer nucleic acid sequence. As is generally known in the
art, preferred PCR primers are from about 15 to about 35 nucleotides in length, with from
about 20 to about 30 being preferred, and may contain inosine as needed. PCR reaction
conditions are well known in the art (e.g., Innis, PCR Protocols, supra).
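
By way of illustration only, the primer length preferences stated above can be expressed as a simple check. The following Python sketch is not part of the disclosed methods; the function name classify_primer_length, the labels it returns, the one-letter code "I" for inosine, and the treatment of "about" as exact inclusive bounds are all assumptions made for the example.

```python
def classify_primer_length(primer: str) -> str:
    """Classify a primer by length using the ranges given in the text.

    Assumptions (hypothetical, for illustration): "about 15 to about 35"
    and "about 20 to about 30" are treated as inclusive hard bounds, and
    inosine is written with the one-letter code "I".
    """
    seq = primer.upper()
    # Reject empty input and any character outside A, C, G, T, I.
    if not seq or not set(seq) <= set("ACGTI"):
        raise ValueError("primer must contain only A, C, G, T or I (inosine)")
    n = len(seq)
    if 20 <= n <= 30:
        return "more preferred"
    if 15 <= n <= 35:
        return "preferred"
    return "outside preferred range"
```

A 24-nucleotide primer would fall in the narrower range under these assumptions, while a 10-nucleotide primer would fall outside both ranges.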

Antibodies to lung cancer proteins 

In a preferred embodiment, when a lung cancer protein is to be used to generate
antibodies, e.g., for immunotherapy or immunodiagnosis, the lung cancer protein should
share at least one epitope or determinant with the full length protein. By "epitope" or
"determinant" herein is typically meant a portion of a protein which will generate and/or bind
an antibody or T-cell receptor in the context of MHC. Thus, in most instances, antibodies
made to a smaller lung cancer protein will be able to bind to the full-length protein,
particularly to linear epitopes. In a preferred embodiment, the epitope is unique; that is,
antibodies generated to a unique epitope show little or no cross-reactivity.

Methods of preparing polyclonal antibodies are well known (e.g., Coligan, supra; and
Harlow and Lane, supra). Polyclonal antibodies can be raised in a mammal, e.g., by one or
more injections of an immunizing agent and, if desired, an adjuvant. Typically, the
immunizing agent and/or adjuvant will be injected in the mammal by multiple subcutaneous
or intraperitoneal injections. The immunizing agent may include a protein encoded by a
nucleic acid of Tables 1A-16 or fragment thereof or a fusion protein thereof. It may be useful
to conjugate the immunizing agent to a protein known to be immunogenic in the mammal
being immunized. Immunogenic proteins include, e.g., keyhole limpet hemocyanin, serum
albumin, bovine thyroglobulin, and soybean trypsin inhibitor. Adjuvants include, e.g.,
Freund's complete adjuvant and MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic
trehalose dicorynomycolate). The immunization protocol may be selected by one skilled in
the art.

The antibodies may, alternatively, be monoclonal antibodies. Monoclonal antibodies
may be prepared using hybridoma methods, such as those described by Kohler and Milstein
(1975) Nature 256:495. In a hybridoma method, a mouse, hamster, or other appropriate host
animal is typically immunized with an immunizing agent to elicit lymphocytes that produce
or are capable of producing antibodies that will specifically bind to the immunizing agent.
Alternatively, the lymphocytes may be immunized in vitro. The immunizing agent will
typically include a polypeptide encoded by a nucleic acid of the tables, or fragment thereof,
or a fusion protein thereof. Generally, either peripheral blood lymphocytes ("PBLs") are
used if cells of human origin are desired, or spleen cells or lymph node cells are used if
non-human mammalian sources are desired. The lymphocytes are then fused with an
immortalized cell line using a suitable fusing agent, such as polyethylene glycol, to form a
hybridoma cell (Goding (1986) Monoclonal Antibodies: Principles and Practice, pp. 59-103).
Immortalized cell lines are usually transformed mammalian cells, particularly myeloma cells
of rodent, bovine, or primate origin. Usually, rat or mouse myeloma cell lines are employed.
The hybridoma cells may be cultured in a suitable culture medium that preferably contains
one or more substances that inhibit the growth or survival of the unfused, immortalized cells.
For example, if the parental cells lack the enzyme hypoxanthine guanine phosphoribosyl
transferase (HGPRT or HPRT), the culture medium for the hybridomas typically will include
hypoxanthine, aminopterin, and thymidine ("HAT medium"), which substances prevent the
growth of HGPRT-deficient cells.

In one embodiment, the antibodies are bispecific antibodies. Bispecific antibodies are
typically monoclonal, preferably human or humanized, antibodies that have binding
specificities for at least two different antigens or that have binding specificities for two
epitopes on the same antigen. In one embodiment, one of the binding specificities is for a
protein encoded by a nucleic acid of the tables or a fragment thereof, the other one is for any
other antigen, and preferably for a cell-surface protein or receptor or receptor subunit,
preferably one that is tumor specific. Alternatively, tetramer-type technology may create
multivalent reagents.

In a preferred embodiment, the antibodies to lung cancer protein are capable of
reducing or eliminating a biological function of a lung cancer protein, in a naked form or
conjugated to an effector moiety. That is, the addition of anti-lung cancer protein antibodies
(either polyclonal or preferably monoclonal) to lung cancer tissue (or cells containing lung
cancer) may reduce or eliminate the lung cancer. Generally, at least a 25% decrease in
activity, growth, size or the like is preferred, with at least about 50% being particularly
preferred and about a 95-100% decrease being especially preferred.

In a preferred embodiment the antibodies to the lung cancer proteins are humanized
antibodies (e.g., Xenerex Biosciences, Medarex, Inc., Abgenix, Inc., Protein Design Labs,
Inc.). Humanized forms of non-human (e.g., murine) antibodies are chimeric molecules of
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab',
F(ab')2 or other antigen-binding subsequences of antibodies) which contain minimal
sequence derived from non-human immunoglobulin. Humanized antibodies include human
immunoglobulins (recipient antibody) in which residues from a complementarity determining
region (CDR) of the recipient are replaced by residues from a CDR of a non-human species
(donor antibody) such as mouse, rat or rabbit having the desired specificity, affinity and
capacity. In some instances, Fv framework residues of a human immunoglobulin are
replaced by corresponding non-human residues. Humanized antibodies may also comprise
residues which are found neither in the recipient antibody nor in the imported CDR or
framework sequences. In general, a humanized antibody will comprise substantially all of at
least one, and typically two, variable domains, in which all or substantially all of the CDR
regions correspond to those of a non-human immunoglobulin and all or substantially all of
the framework (FR) regions are those of a human immunoglobulin consensus sequence. A
humanized antibody optimally also will comprise at least a portion of an
immunoglobulin constant region (Fc), typically that of a human immunoglobulin (Jones, et
al. (1986) Nature 321:522-525; Riechmann, et al. (1988) Nature 332:323-329; and Presta
(1992) Curr. Op. Struct. Biol. 2:593-596). Humanization can be performed following the
method of Winter and co-workers (Jones, et al. (1986) Nature 321:522-525; Riechmann, et al.
(1988) Nature 332:323-327; Verhoeyen, et al. (1988) Science 239:1534-1536), by
substituting rodent CDRs or CDR sequences for corresponding sequences of a human
antibody. Accordingly, such humanized antibodies are chimeric antibodies (U.S. Patent No.
4,816,567), wherein substantially less than an intact human variable domain has been
substituted by corresponding sequence from a non-human species.

Human-like antibodies can also be produced using various techniques known in the
art, including phage display libraries (Hoogenboom and Winter (1991) J. Mol. Biol. 227:381;
Marks, et al. (1991) J. Mol. Biol. 222:581). The techniques of Cole, et al. and Boerner, et al.
are also available for the preparation of human monoclonal antibodies (Cole, et al. (1985)
Monoclonal Antibodies and Cancer Therapy, p. 77 and Boerner, et al. (1991) J. Immunol.
147(1):86-95). Similarly, human antibodies can be made by introducing human
immunoglobulin loci into transgenic animals, e.g., mice in which the endogenous
immunoglobulin genes have been partially or completely inactivated. Upon challenge,
human antibody production is observed, which closely resembles that seen in humans in
nearly all respects, including gene rearrangement, assembly, and antibody repertoire. This
approach is described, e.g., in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in the following scientific publications: Marks, et al. (1992)
Bio/Technology 10:779-783; Lonberg, et al. (1994) Nature 368:856-859; Morrison (1994)
Nature 368:812-13; Fishwild, et al. (1996) Nature Biotechnology 14:845-51; Neuberger
(1996) Nature Biotechnology 14:826; and Lonberg and Huszar (1995) Intern. Rev. Immunol.
13:65-93.

By immunotherapy is meant treatment of lung cancer with an antibody raised against
a lung cancer protein. As used herein, immunotherapy can be passive or active. Passive
immunotherapy as defined herein is the passive transfer of antibody to a recipient (patient).
Active immunization is the induction of antibody and/or T-cell responses in a recipient
(patient). Induction of an immune response is the result of providing the recipient with an
antigen to which antibodies are raised. The antigen may be provided by injecting a
polypeptide against which antibodies are desired to be raised into a recipient, or contacting
the recipient with a nucleic acid capable of expressing the antigen and under conditions for
expression of the antigen, leading to an immune response.

In a preferred embodiment the lung cancer proteins against which antibodies are
raised are secreted proteins as described above. Without being bound by theory, antibodies
used for treatment may bind the secreted protein and prevent it from binding to its receptor,
thereby inactivating the secreted lung cancer protein.

In another preferred embodiment, the lung cancer protein to which antibodies are
raised is a transmembrane protein. Without being bound by theory, antibodies used for
treatment may bind the extracellular domain of the lung cancer protein and prevent it from
binding to other proteins, such as circulating ligands or cell-associated molecules. The
WO 02/086443 PCT/US02/12476 
antibody may cause down-regulation of the transmembrane lung cancer protein. The
antibody may be a competitive, non-competitive or uncompetitive inhibitor of protein binding
to the extracellular domain of the lung cancer protein. The antibody may be an antagonist of
the lung cancer protein or may prevent activation of a transmembrane lung cancer protein, or
may induce or suppress a particular cellular pathway. In some embodiments, when the
antibody prevents the binding of other molecules to the lung cancer protein, the antibody
prevents growth of the cell. The antibody may also be used to target or sensitize the cell to
cytotoxic agents, including, but not limited to, TNF-α, TNF-β, IL-1, IFN-γ, and IL-2, or
chemotherapeutic agents including 5FU, vinblastine, actinomycin D, cisplatin, methotrexate,
and the like. In some instances the antibody may belong to a sub-type that activates serum
complement when complexed with the transmembrane protein, thereby mediating cytotoxicity
or antibody-dependent cellular cytotoxicity (ADCC). Thus, lung cancer may be treated by
administering to a patient antibodies directed against the transmembrane lung cancer protein.
Antibody-labeling may activate a co-toxin, localize a toxin payload, or otherwise provide
means to locally ablate cells.

In another preferred embodiment, the antibody is conjugated to an effector moiety.
The effector moiety can be various molecules, including labeling moieties such as radioactive
labels or fluorescent labels, or can be a therapeutic moiety. In one aspect the therapeutic
moiety is a small molecule that modulates the activity of a lung cancer protein. In another
aspect the therapeutic moiety may modulate an activity of molecules associated with or in
close proximity to a lung cancer protein. The therapeutic moiety may inhibit enzymatic or
signaling activity such as protease or collagenase activity associated with lung cancer.

In a preferred embodiment, the therapeutic moiety can also be a cytotoxic agent. In
this method, targeting the cytotoxic agent to lung cancer tissue or cells results in a reduction
in the number of afflicted cells, thereby reducing symptoms associated with lung cancer.
Cytotoxic agents are numerous and varied and include, but are not limited to, cytotoxic drugs
or toxins or active fragments of such toxins. Suitable toxins and their corresponding
fragments include diphtheria A chain, exotoxin A chain, ricin A chain, abrin A chain, curcin,
crotin, phenomycin, enomycin, saporin, auristatin, and the like. Cytotoxic agents also include
radiochemicals made by conjugating radioisotopes to antibodies raised against lung cancer
proteins, or binding of a radionuclide to a chelating agent that has been covalently attached to
the antibody. Targeting the therapeutic moiety to transmembrane lung cancer proteins not
only serves to increase the local concentration of therapeutic moiety in the lung cancer
afflicted area, but also serves to reduce deleterious side effects that may be associated with
the untargeted therapeutic moiety.

In another preferred embodiment, the lung cancer protein against which the antibodies
are raised is an intracellular protein. In this case, the antibody may be conjugated to a protein
or other entity which facilitates entry into the cell. In one case, the antibody enters the cell by
endocytosis. In another embodiment, a nucleic acid encoding the antibody is administered to
the individual or cell. Moreover, where the lung cancer protein can be targeted within a
cell, e.g., in the nucleus, an antibody thereto may contain a signal for that target localization,
e.g., a nuclear localization signal.

The lung cancer antibodies of the invention specifically bind to lung cancer proteins.
By "specifically bind" herein is meant that the antibodies bind to the protein with a Kd of at
least about 0.1 mM, more usually at least about 1 µM, preferably at least about 0.1 µM or
better, and most preferably, 0.01 µM or better. Selectivity of binding to the specific target
and not to related other sequences is also important.
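
By way of illustration only, the affinity tiers in the definition of "specifically bind" can be sketched as a lookup on the dissociation constant. The sketch assumes the tiers are 0.1 mM, 1 µM, 0.1 µM and 0.01 µM, treats "about" as an exact bound, and uses a hypothetical function name (binding_tier) and hypothetical tier labels; a lower Kd denotes tighter binding.

```python
# Kd thresholds in molar units, ordered from strongest to weakest tier.
# Assumption: tiers of 0.01 uM, 0.1 uM, 1 uM and 0.1 mM; labels are
# hypothetical and chosen only for this example.
KD_TIERS = [
    (1e-8, "most preferred"),    # 0.01 uM or better
    (1e-7, "preferred"),         # 0.1 uM or better
    (1e-6, "more usual"),        # 1 uM or better
    (1e-4, "specific binding"),  # 0.1 mM or better
]


def binding_tier(kd_molar: float) -> str:
    """Return the tightest tier whose Kd threshold the antibody meets."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in mol/L")
    for threshold, label in KD_TIERS:
        if kd_molar <= threshold:
            return label
    return "not specific"
```

The sketch addresses affinity only; selectivity against related sequences would have to be assessed separately.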

Detection of lung cancer sequence for diagnostic and therapeutic applications 

In one aspect, the RNA expression levels of genes are determined for different
cellular states in the lung cancer phenotype. Expression levels of genes in normal tissue (e.g.,
not undergoing lung cancer), in lung cancer tissue (and in some cases, for varying severities
of lung cancer that relate to prognosis, as outlined below), or in non-malignant disease are
evaluated to provide expression profiles. A gene expression profile of a particular cell state
or point of development is essentially a "fingerprint" of the state of the cell. While two states
may have a particular gene similarly expressed, the evaluation of a number of genes
simultaneously allows the generation of a gene expression profile that is reflective of the state
of the cell. By comparing expression profiles of cells in different states, information
regarding which genes are important (including both up- and down-regulation of genes) in
each of these states is obtained. Then, diagnosis may be performed or confirmed to
determine whether a tissue sample has the gene expression profile of normal or cancerous
tissue. This will provide for molecular diagnosis of related conditions.

"Differential expression," or grammatical equivalents as used herein, refers to
qualitative or quantitative differences in the temporal and/or cellular gene expression
patterns within and among cells and tissue. Thus, a differentially expressed gene can
qualitatively have its expression altered, including an activation or inactivation, in, e.g.,
normal versus lung cancer tissue. Genes may be turned on or turned off in a particular state,
relative to another state, thus permitting comparison of two or more states. A qualitatively
regulated gene will exhibit an expression pattern within a state or cell type which is
detectable by standard techniques. Some genes will be expressed in one state or cell type, but
not in both. Alternatively, the difference in expression may be quantitative, e.g., in that
expression is increased or decreased; i.e., gene expression is either upregulated, resulting in
an increased amount of transcript, or downregulated, resulting in a decreased amount of
transcript. The degree to which expression differs need only be large enough to quantify via
standard characterization techniques as outlined below, such as by use of Affymetrix
GeneChip™ expression arrays, Lockhart (1996) Nature Biotechnology 14:1675-1680, hereby
expressly incorporated by reference. Other techniques include, but are not limited to,
quantitative reverse transcriptase PCR, northern analysis and RNase protection. As outlined
above, preferably the change in expression (i.e., upregulation or downregulation) is typically
at least about 50%, more preferably at least about 100%, more preferably at least about
150%, more preferably at least about 200%, with from 300 to at least 1000% being especially
preferred.
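
By way of illustration only, the percentage thresholds above can be applied to a pair of expression measurements as follows. The text does not specify how a percentage change is computed for downregulation; this sketch assumes a symmetric convention in which the change is expressed relative to the lower of the two values, so that a four-fold decrease and a four-fold increase both count as 300%. The function names and the threshold list are hypothetical.

```python
from typing import Optional, Tuple

# Thresholds (percent) named in the text, in ascending order.
THRESHOLDS = [50, 100, 150, 200, 300]


def percent_change(reference: float, sample: float) -> Tuple[str, float]:
    """Return (direction, percent change) between two expression levels.

    Assumption: the change is measured relative to the lower value, so
    up- and downregulation of the same fold give the same percentage.
    """
    if reference <= 0 or sample <= 0:
        raise ValueError("expression levels must be positive")
    if sample > reference:
        return "up", (sample / reference - 1.0) * 100.0
    if sample < reference:
        return "down", (reference / sample - 1.0) * 100.0
    return "unchanged", 0.0


def highest_threshold_met(pct: float) -> Optional[int]:
    """Return the largest listed threshold that pct reaches, if any."""
    met = [t for t in THRESHOLDS if pct >= t]
    return max(met) if met else None
```

Under this convention, a signal rising from 10 to 25 units is a 150% upregulation, and a signal falling from 40 to 10 units is a 300% downregulation.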

Evaluation may be at the gene transcript or the protein level. The amount of gene
expression may be monitored using nucleic acid probes to the RNA or DNA equivalent of the
gene transcript, and the quantification of gene expression levels, or, alternatively, the final
gene product itself (protein) can be monitored, e.g., with antibodies to the lung cancer protein
and standard immunoassays (ELISAs, etc.) or other techniques, including mass spectroscopy
assays, 2D gel electrophoresis assays, etc. Proteins corresponding to lung cancer genes, e.g.,
those identified as being important in a lung cancer or disease phenotype, can be evaluated in
a lung cancer diagnostic test. In a preferred embodiment, gene expression monitoring is
performed simultaneously on a number of genes.

The lung cancer nucleic acid probes may be attached to biochips as outlined herein for
the detection and quantification of lung cancer sequences in a particular cell. The assays are
further described below in the example. PCR techniques can be used to provide greater
sensitivity. Multiple protein expression monitoring can be performed as well. Similarly,
these assays may be performed on an individual basis as well.

In a preferred embodiment nucleic acids encoding the lung cancer protein are
detected. Although DNA or RNA encoding the lung cancer protein may be detected, of
particular interest are methods wherein an mRNA encoding a lung cancer protein is detected.
Probes to detect mRNA can be a nucleotide/deoxynucleotide probe that is complementary to
and hybridizes with the mRNA and includes, but is not limited to, oligonucleotides, cDNA or
RNA. Probes also should contain a detectable label, as defined herein. In one method the
mRNA is detected after immobilizing the nucleic acid to be examined on a solid support such
as nylon membranes and hybridizing the probe with the sample. Following washing to
remove the non-specifically bound probe, the label is detected. In another method detection
of the mRNA is performed in situ. In this method permeabilized cells or tissue samples are
contacted with a detectably labeled nucleic acid probe for sufficient time to allow the probe
to hybridize with the target mRNA. Following washing to remove the non-specifically bound
probe, the label is detected. For example, a digoxigenin labeled riboprobe (RNA probe) that
is complementary to the mRNA encoding a lung cancer protein is detected by binding the
digoxigenin with an anti-digoxigenin secondary antibody and developed with nitro blue
tetrazolium and 5-bromo-4-chloro-3-indolyl phosphate.

In a preferred embodiment, various proteins from the three classes of proteins as
described herein (secreted, transmembrane or intracellular proteins) are used in diagnostic
assays. The lung cancer proteins, antibodies, nucleic acids, modified proteins and cells
containing lung cancer sequences are used in diagnostic assays. This can be performed on an
individual gene or corresponding polypeptide level. In a preferred embodiment, the
expression profiles are used, preferably in conjunction with high throughput screening
techniques to allow monitoring for expression profile genes and/or corresponding
polypeptides.

As described and defined herein, lung cancer proteins, including intracellular,
transmembrane, or secreted proteins, find use as markers of lung cancer, e.g., for prognostic
or diagnostic purposes. Detection of these proteins in putative lung cancer tissue allows for
detection, prognosis, or diagnosis of lung cancer or similar disease, and perhaps for selection
of therapeutic strategy. In one embodiment, antibodies are used to detect lung cancer
proteins. A preferred method separates proteins from a sample by electrophoresis on a gel
(typically a denaturing and reducing protein gel, but may be another type of gel, including
isoelectric focusing gels and the like). Following separation of proteins, the lung cancer
protein is detected, e.g., by immunoblotting with antibodies raised against the lung cancer
protein. Methods of immunoblotting are well known to those of ordinary skill in the art.

In another preferred method, antibodies to the lung cancer protein find use in in situ
imaging techniques, e.g., in histology (e.g., Asai (ed. 1993) Methods in Cell Biology:
Antibodies in Cell Biology, volume 37). In this method cells are contacted with from one to
many antibodies to the lung cancer protein(s). Following washing to remove non-specific
antibody binding, the presence of the antibody or antibodies is detected. In one embodiment
the antibody is detected by incubating with a secondary antibody that contains a detectable
label, e.g., multicolor fluorescence or confocal imaging. In another method the primary
antibody to the lung cancer protein(s) contains a detectable label, e.g., an enzyme marker that
can act on a substrate. In another preferred embodiment each one of multiple primary
antibodies contains a distinct and detectable label. This method finds particular use in
simultaneous screening for a plurality of lung cancer proteins. Many other histological
imaging techniques are also provided by the invention.

In a preferred embodiment the label is detected in a fluorometer which has the ability
to detect and distinguish emissions of different wavelengths. In addition, a fluorescence
activated cell sorter (FACS) can be used in the method.

In another preferred embodiment, antibodies find use in diagnosing lung cancer from
blood, serum, plasma, stool, and other samples. Such samples, therefore, are useful as
samples to be probed or tested for the presence of lung cancer proteins. Antibodies can be
used to detect a lung cancer protein by previously described immunoassay techniques
including ELISA, immunoblotting (western blotting), immunoprecipitation, BIACORE
technology and the like. Conversely, the presence of antibodies may indicate an immune
response against an endogenous lung cancer protein or vaccine.

In a preferred embodiment, in situ hybridization of labeled lung cancer nucleic acid
probes to tissue arrays is done. For example, arrays of tissue samples, including lung cancer
tissue and/or normal tissue, are made. In situ hybridization (see, e.g., Ausubel, supra) is then
performed. When comparing the fingerprints between an individual and a standard, the
skilled artisan can make a diagnosis, a prognosis, or a prediction based on the findings. It is
further understood that the genes which indicate the diagnosis may differ from those which
indicate the prognosis, and molecular profiling of the condition of the cells may lead to
distinctions between responsive or refractory conditions or may be predictive of outcomes.

In a preferred embodiment, the lung cancer proteins, antibodies, nucleic acids,
modified proteins and cells containing lung cancer sequences are used in prognosis assays.
As above, gene expression profiles can be generated that correlate to lung cancer, clinical,
pathological, or other information, in terms of long term prognosis. Again, this may be done
on either a protein or gene level, with the use of genes being preferred. Single or multiple
genes may be useful in various combinations. As above, lung cancer probes may be attached
to biochips for the detection and quantification of lung cancer sequences in a tissue or patient.
The assays proceed as outlined above for diagnosis. PCR methods may provide more
sensitive and accurate quantification.

Assays for therapeutic compounds 

In a preferred embodiment, the proteins, nucleic acids, and antibodies as described
herein are used in drug screening assays. The lung cancer proteins, antibodies, nucleic acids,
modified proteins and cells containing lung cancer sequences are used in drug screening
assays or by evaluating the effect of drug candidates on a "gene expression profile" or
expression profile of polypeptides. In a preferred embodiment, the expression profiles are
used, preferably in conjunction with high throughput screening techniques to allow
monitoring for expression profile genes after treatment with a candidate agent (e.g.,
Zlokarnik, et al. (1998) Science 279:84-8; Heid (1996) Genome Res. 6:986-94).

In a preferred embodiment, the lung cancer proteins, antibodies, nucleic acids,
modified proteins and cells containing the native or modified lung cancer proteins are used in
screening assays. That is, the present invention provides novel methods for screening for
compositions which modulate the lung cancer phenotype or an identified physiological
function of a lung cancer protein. As above, this can be done on an individual gene level or
by evaluating the effect of drug candidates on a "gene expression profile". In a preferred
embodiment, the expression profiles are used, preferably in conjunction with high throughput
screening techniques to allow monitoring for expression profile genes after treatment with a
candidate agent, see Zlokarnik, supra.

Having identified differentially expressed genes herein, a variety of assays may be
performed. In a preferred embodiment, assays may be run on an individual gene or protein
level. That is, having identified a particular gene with altered regulation in lung cancer, test
compounds can be screened for the ability to modulate gene expression or for binding to the
lung cancer protein. "Modulation" thus includes an increase or a decrease in gene
expression. The preferred amount of modulation will depend on the original change of the
gene expression in normal versus tissue undergoing lung cancer, with changes of at least
10%, preferably 50%, more preferably 100-300%, and in some embodiments 300-1000% or
greater. Thus, if a gene exhibits a 4-fold increase in lung cancer tissue compared to normal
tissue, a decrease of about four-fold is often desired; similarly, a 10-fold decrease in lung
cancer tissue compared to normal tissue often provides a target value of a 10-fold increase in
expression to be induced by the test compound.
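
By way of illustration only, the reasoning in the preceding paragraph, in which the desired modulation mirrors the change observed between normal and lung cancer tissue, can be sketched as follows. The function name target_modulation and the use of a simple ratio of two positive expression levels are assumptions made for the example.

```python
from typing import Tuple


def target_modulation(normal: float, tumor: float) -> Tuple[str, float]:
    """Return the direction and fold change a test compound would need
    to produce to bring tumor expression back to the normal level.

    Assumption: expression levels are positive values on a linear scale.
    """
    if normal <= 0 or tumor <= 0:
        raise ValueError("expression levels must be positive")
    if tumor > normal:
        # Gene is upregulated in cancer: a matching decrease is the target.
        return "decrease", tumor / normal
    if tumor < normal:
        # Gene is downregulated in cancer: a matching increase is the target.
        return "increase", normal / tumor
    return "none", 1.0
```

For a gene expressed at 4 units in lung cancer tissue and 1 unit in normal tissue, the sketch returns a four-fold decrease as the target; for 1 unit against 10 units, it returns a ten-fold increase.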

The amount of gene expression may be monitored using nucleic acid probes and the
quantification of gene expression levels, or, alternatively, the gene product itself can be
monitored, e.g., through the use of antibodies to the lung cancer protein and standard
immunoassays. Proteomics and separation techniques may also allow quantification of
expression.

In a preferred embodiment, the expression of a number of genes or proteins, i.e., an
expression profile, is monitored simultaneously. Such profiles will typically
involve a plurality of those entities described herein.

In this embodiment, the lung cancer nucleic acid probes are attached to biochips as
outlined herein for the detection and quantification of lung cancer sequences in a particular
cell. Alternatively, PCR may be used. Thus, a series, e.g., of microtiter plates, may be used
with dispensed primers in desired wells. A PCR reaction can then be performed and analyzed
for each well.

Expression monitoring can be performed to identify compounds that modify the
expression of one or more lung cancer-associated sequences, e.g., a polynucleotide sequence
set out in the tables. Generally, in a preferred embodiment, a test compound is added to the
cells prior to analysis. Moreover, screens are also provided to identify agents that modulate
lung cancer, modulate lung cancer proteins, bind to a lung cancer protein, or interfere with
the binding of a lung cancer protein and an antibody, substrate, or other binding partner.

The term "test compound" or "drug candidate" or "modulator" or grammatical
equivalents as used herein describes a molecule, e.g., protein, oligopeptide, small organic
molecule, polysaccharide, polynucleotide, etc., to be tested for the capacity to directly or
indirectly alter the lung cancer phenotype or the expression of a lung cancer sequence, e.g., a
nucleic acid or protein sequence. In preferred embodiments, modulators alter expression
profiles of nucleic acids or proteins provided herein. In one embodiment, the modulator
suppresses a lung cancer phenotype, e.g., to a normal or non-malignant tissue fingerprint. In
another embodiment, a modulator induces a lung cancer phenotype. Generally, a plurality of
assay mixtures are run in parallel with different agent concentrations to obtain a differential
response to the various concentrations. Typically, one of these concentrations serves as a
negative control, i.e., at zero concentration or below the level of detection.
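
By way of illustration only, a set of parallel assay concentrations that includes a zero-concentration negative control can be laid out as follows. The serial-dilution scheme, the function name concentration_series, and its parameters are assumptions made for the example; the text does not prescribe any particular spacing of concentrations.

```python
from typing import List


def concentration_series(top: float, dilution_factor: float,
                         n_points: int) -> List[float]:
    """Return ascending agent concentrations preceded by a zero control.

    Assumption: concentrations are generated by serial dilution from a
    top concentration; units are whatever units `top` is given in.
    """
    if top <= 0 or dilution_factor <= 1 or n_points < 1:
        raise ValueError("need top > 0, dilution_factor > 1, n_points >= 1")
    dilutions = [top / dilution_factor ** i for i in range(n_points)]
    # 0.0 is the negative control; the rest run from lowest to highest.
    return [0.0] + dilutions[::-1]
```

With a top concentration of 100 units, a ten-fold dilution factor and three points, the series is 0, 1, 10 and 100 units, the first entry serving as the negative control.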

In one aspect, a modulator will neutralize the effect of a lung cancer protein. By
"neutralize" is meant that activity of a protein and the consequent effect on the cell is
inhibited or blocked.

In certain embodiments, combinatorial libraries of potential modulators will be
screened for an ability to bind to a lung cancer polypeptide or to modulate activity.
Conventionally, new chemical entities with useful properties are generated by identifying a
chemical compound (called a "lead compound") with some desirable property or activity,
e.g., inhibiting activity, creating variants of the lead compound, and evaluating the property
and activity of those variant compounds. Often, high throughput screening (HTS) methods
are employed for such an analysis.

In one preferred embodiment, high throughput screening methods involve providing a 
library containing a large number of potential therapeutic compounds (candidate 
compounds). Such "combinatorial chemical libraries" are then screened in one or more 
assays to identify those library members (particular chemical species or subclasses) that
display a desired characteristic activity. The compounds thus identified can serve as
conventional "lead compounds" or can themselves be used as potential or actual therapeutics.

A combinatorial chemical library is a collection of diverse chemical compounds 
generated by either chemical synthesis or biological synthesis by combining a number of 
chemical "building blocks" such as reagents. For example, a linear combinatorial chemical
library, such as a polypeptide (e.g., mutein) library, is formed by combining a set of chemical
building blocks called amino acids in every possible way for a given compound length (i.e., 
the number of amino acids in a polypeptide compound). Millions of chemical compounds 
can be synthesized through such combinatorial mixing of chemical building blocks (Gallop, 
et al. (1994) J. Med. Chem. 37(9):1233-1251). 
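
The combinatorial mixing described above can be illustrated with a short sketch. It
assumes the 20 standard amino acids as the building blocks; the names `library_size`
and `enumerate_library` are hypothetical and serve only to show how the number of
library members grows with compound length.

```python
from itertools import product

# One-letter codes for the 20 standard amino acids (assumed building blocks).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def library_size(length, alphabet=AMINO_ACIDS):
    """Number of distinct linear sequences of the given length."""
    return len(alphabet) ** length


def enumerate_library(length, alphabet=AMINO_ACIDS):
    """Yield every sequence of the given length, one at a time."""
    for residues in product(alphabet, repeat=length):
        yield "".join(residues)


# Every possible dipeptide: 20 * 20 combinations.
dipeptides = list(enumerate_library(2))
```

Under these assumptions a library of pentapeptides already has 20^5 = 3,200,000
members, which is consistent with the statement that millions of compounds can be
obtained in this way.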

Preparation and screening of combinatorial chemical libraries is well known to those
of skill in the art. Such combinatorial chemical libraries include, but are not limited to,
peptide libraries (see, e.g., U.S. Patent No. 5,010,175, Furka (1991) Pept. Prot. Res.
37:487-493, Houghton, et al. (1991) Nature 354:84-88), peptoids (PCT Publication No. WO
91/19735), encoded peptides (PCT Publication WO 93/20242), random bio-oligomers (PCT
Publication WO 92/00091), benzodiazepines (U.S. Pat. No. 5,288,514), diversomers such as
hydantoins, benzodiazepines and dipeptides (Hobbs, et al. (1993) Proc. Nat. Acad. Sci. USA
90:6909-6913), vinylogous polypeptides (Hagihara, et al. (1992) J. Amer. Chem. Soc.
114:6568), nonpeptidal peptidomimetics with a Beta-D-Glucose scaffolding (Hirschmann, et
al. (1992) J. Amer. Chem. Soc. 114:9217-9218), analogous organic syntheses of small
compound libraries (Chen, et al. (1994) J. Amer. Chem. Soc. 116:2661), oligocarbamates
(Cho, et al. (1993) Science 261:1303), and/or peptidyl phosphonates (Campbell, et al. (1994)
J. Org. Chem. 59:658). See, generally, Gordon, et al. (1994) J. Med. Chem. 37:1385, nucleic
acid libraries (see, e.g., Stratagene, Corp.), peptide nucleic acid libraries (see, e.g., U.S.
Patent 5,539,083), antibody libraries (see, e.g., Vaughn, et al. (1996) Nature Biotechnology
14(3):309-314, and PCT/US96/10287), carbohydrate libraries (see, e.g., Liang, et al. (1996)
Science 274:1520-1522, and U.S. Patent No. 5,593,853), and small organic molecule libraries
(see, e.g., benzodiazepines, Baum (1993) C&EN, Jan 18, page 33; isoprenoids, U.S. Patent
No. 5,569,588; thiazolidinones and metathiazanones, U.S. Patent No. 5,549,974; pyrrolidines,
U.S. Patent Nos. 5,525,735 and 5,519,134; morpholino compounds, U.S. Patent No.
5,506,337; benzodiazepines, U.S. Patent No. 5,288,514; and the like).

WO 02/086443 PCT/US02/12476

Devices for the preparation of combinatorial libraries are commercially available (see,
e.g., 357 MPS, 390 MPS, Advanced Chem Tech, Louisville, KY; Symphony, Rainin,
Woburn, MA; 433A, Applied Biosystems, Foster City, CA; 9050 Plus, Millipore, Bedford,
MA).

A number of well known robotic systems have also been developed for solution phase 
chemistries. These systems include automated workstations like the automated synthesis 
apparatus developed by Takeda Chemical Industries, LTD. (Osaka, Japan) and many robotic
systems utilizing robotic arms (Zymate II, Zymark Corporation, Hopkinton, Mass.; Orca,
Hewlett-Packard, Palo Alto, Calif.), which mimic the manual synthetic operations performed 
by a chemist. The above devices, with appropriate modification, are suitable for use with the 
present invention. In addition, numerous combinatorial libraries are themselves 
commercially available (see, e.g., ComGenex, Princeton, NJ; Asinex, Moscow, RU; Tripos,
Inc., St. Louis, MO; ChemStar, Ltd., Moscow, RU; 3D Pharmaceuticals, Exton, PA; Martek
Biosciences, Columbia, MD; etc.).

The assays to identify modulators are amenable to high throughput screening. 
Preferred assays thus detect modulation of lung cancer gene transcription, polypeptide 
expression, and polypeptide activity. 

High throughput assays for evaluating the presence, absence, quantification, or other
properties of particular nucleic acids or protein products are well known to those of skill in
the art. Binding assays and reporter gene assays are similarly well known. Thus,
e.g., U.S. Patent No. 5,559,410 discloses high throughput screening methods for proteins,
U.S. Patent No. 5,585,639 discloses high throughput screening methods for nucleic acid
binding (i.e., in arrays), while U.S. Patent Nos. 5,576,220 and 5,541,061 disclose high
throughput methods of screening for ligand/antibody binding.

In addition, high throughput screening systems are commercially available (see, e.g.,
Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman
Instruments, Inc., Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems
typically automate procedures, including sample and reagent pipetting, liquid dispensing,
timed incubations, and final readings of the microplate in detector(s) appropriate for the
assay. These configurable systems provide high throughput and rapid start up as well as a
high degree of flexibility and customization. The manufacturers of such systems provide
detailed protocols for various high throughput systems. Thus, e.g., Zymark Corp. provides 
technical bulletins describing screening systems for detecting the modulation of gene 
transcription, ligand binding, and the like. 

In one embodiment, modulators are proteins, often naturally occurring proteins or
fragments of naturally occurring proteins. Thus, e.g., cellular extracts containing proteins, or
random or directed digests of proteinaceous cellular extracts, may be used. In this way
libraries of proteins may be made for screening in the methods of the invention. Particularly
preferred in this embodiment are libraries of bacterial, fungal, viral, and mammalian proteins,
with the latter being preferred, and human proteins being especially preferred. Particularly
useful test compounds will be directed to the class of proteins to which the target belongs, e.g.,
substrates for enzymes or ligands and receptors.

In a preferred embodiment, modulators are peptides of from about 5 to about 30 
amino acids, with from about 5 to about 20 amino acids being preferred, and from about 7 to 
about 15 being particularly preferred. The peptides may be digests of naturally occurring
proteins, random peptides, or "biased" random peptides. By "randomized" or grammatical
equivalents herein is meant that the nucleic acid or peptide consists of essentially random 
sequences of nucleotides and amino acids, respectively. Since these random peptides (or 
nucleic acids, discussed below) are often chemically synthesized, they may incorporate a 
nucleotide or amino acid at any position. The synthetic process can be designed to generate
randomized proteins or nucleic acids, to allow the formation of all or most of the possible
combinations over the length of the sequence, thus forming a library of randomized candidate
bioactive proteinaceous agents.

In one embodiment, the library is fully randomized, with no sequence preferences or
constants at any position. In a preferred embodiment, the library is biased. That is, some
positions within the sequence are either held constant, or are selected from a limited number
of possibilities. In a preferred embodiment, the nucleotides or amino acid residues are
randomized within a defined class, e.g., of hydrophobic amino acids, hydrophilic residues,
sterically biased (either small or large) residues, towards the creation of nucleic acid binding
domains, the creation of cysteines for cross-linking, prolines for SH-3 domains, serines,
threonines, tyrosines or histidines for phosphorylation sites, etc.
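
The distinction between a fully randomized and a biased library may be sketched in
software terms as follows. This is an illustration only: the per-position template,
the particular residues assigned to the hydrophobic class, the seed, and the names
`biased_peptide` and `biased_library` are all assumptions of the sketch.

```python
import random

HYDROPHOBIC = "AVLIMFW"   # one possible hydrophobic class (an assumption)
PHOSPHO_SITES = "STYH"    # serine, threonine, tyrosine, histidine
ALL_RESIDUES = "ACDEFGHIKLMNPQRSTVWY"


def biased_peptide(template, rng):
    """Build one peptide from a per-position template.

    Each template entry is a string of allowed residues: a single letter
    holds the position constant, a longer string restricts the choice to
    that class, and the full alphabet leaves the position fully random.
    """
    return "".join(rng.choice(allowed) for allowed in template)


def biased_library(template, size, seed=0):
    """Draw `size` peptides; a fixed seed keeps the draw reproducible."""
    rng = random.Random(seed)
    return [biased_peptide(template, rng) for _ in range(size)]


# Hypothetical 7-mer: constant cysteine at both ends, one hydrophobic
# position, three fully random positions, one phosphorylation-site residue.
TEMPLATE = ["C", HYDROPHOBIC, ALL_RESIDUES, ALL_RESIDUES, ALL_RESIDUES,
            PHOSPHO_SITES, "C"]
library = biased_library(TEMPLATE, size=100)
```

A fully randomized library corresponds to a template in which every entry is the
full residue alphabet.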

Modulators of lung cancer can also be nucleic acids, as defined above.

As described above generally for proteins, nucleic acid modulating agents may be
naturally occurring nucleic acids, random nucleic acids, or "biased" random nucleic acids.
Digests of procaryotic or eucaryotic genomes may be used as is outlined above for proteins.

In a preferred embodiment, the candidate compounds are organic chemical moieties, a 
wide variety of which are available in the literature. 

After a candidate agent has been added and the cells allowed to incubate for some
period of time, the sample containing a target sequence is analyzed. If required, the target
sequence is prepared using known techniques. For example, the sample may be treated to
lyse the cells, using known lysis buffers, electroporation, etc., with purification and/or
amplification such as PCR performed as appropriate. For example, an in vitro transcription
with labels covalently attached to the nucleotides is performed. Generally, the nucleic acids
are labeled with biotin-FITC or PE, or with cy3 or cy5.

In a preferred embodiment, the target sequence is labeled with, e.g., a fluorescent, a
chemiluminescent, a chemical, or a radioactive signal, to provide a means of detecting the
target sequence's specific binding to a probe. The label also can be an enzyme, such as
alkaline phosphatase or horseradish peroxidase, which when provided with an appropriate
substrate produces a product that can be detected. Alternatively, the label can be a labeled
compound or small molecule, such as an enzyme inhibitor, that binds but is not catalyzed or
altered by the enzyme. The label also can be a moiety or compound, such as an epitope tag
or biotin, which specifically binds to streptavidin. For the example of biotin, the streptavidin
is labeled as described above, thereby providing a detectable signal for the bound target
sequence. Unbound labeled streptavidin is typically removed prior to analysis.

Nucleic acid assays can be direct hybridization assays or can comprise "sandwich
assays", which include the use of multiple probes, as is generally outlined in U.S. Patent Nos.
5,681,702, 5,597,909, 5,545,730, 5,594,117, 5,591,584, 5,571,670, 5,580,731,
5,624,802, 5,635,352, 5,594,118, 5,359,100, 5,124,246 and 5,681,697, all of
which are hereby incorporated by reference. In this embodiment, in general, the target nucleic
acid is prepared as outlined above, and then added to the biochip comprising a plurality of
nucleic acid probes, under conditions that allow the formation of a hybridization complex.

A variety of hybridization conditions may be used in the present invention, including
high, moderate and low stringency conditions as outlined above. The assays are generally
run under stringency conditions which allow formation of the label probe hybridization
complex only in the presence of target. Stringency can be controlled by altering a step
parameter that is a thermodynamic variable, including, but not limited to, temperature,
formamide concentration, salt concentration, chaotropic salt concentration, pH, organic
solvent concentration, etc.

These parameters may also be used to control non-specific binding, as is generally
outlined in U.S. Patent No. 5,681,697. Thus it may be desirable to perform certain steps at
higher stringency conditions to reduce non-specific binding.

The reactions outlined herein may be accomplished in a variety of ways. Components 
of the reaction may be added simultaneously, or sequentially, in different orders, with 
preferred embodiments outlined below. In addition, the reaction may include a variety of 
other reagents. These include salts, buffers, neutral proteins, e.g., albumin, detergents, etc.,
which may be used to facilitate optimal hybridization and detection, and/or reduce
non-specific or background interactions. Reagents that otherwise improve the efficiency of the
assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may also be
used as appropriate, depending on the sample preparation methods and purity of the target.

The assay data are analyzed to determine the expression levels, and changes in
expression levels as between states, of individual genes, forming a gene expression profile.
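
One way such an analysis might be expressed is sketched below. The log2 ratio, the
pseudocount, the two-fold threshold, and the placeholder gene names and intensities
are assumptions chosen for illustration; the disclosure does not prescribe a
particular calculation.

```python
import math


def expression_profile(state_a, state_b, pseudocount=1.0):
    """Return the log2 ratio (state_b over state_a) for each shared gene.

    Inputs map gene identifiers to non-negative expression levels. The
    pseudocount avoids division by zero and is an assumption of this sketch.
    """
    shared = sorted(set(state_a) & set(state_b))
    return {
        gene: math.log2((state_b[gene] + pseudocount) / (state_a[gene] + pseudocount))
        for gene in shared
    }


def changed_genes(profile, threshold=1.0):
    """Genes whose absolute log2 change meets the threshold (2-fold by default)."""
    return [gene for gene, change in profile.items() if abs(change) >= threshold]


# Hypothetical signal intensities for three placeholder genes.
normal = {"geneA": 99.0, "geneB": 49.0, "geneC": 199.0}
tumor = {"geneA": 399.0, "geneB": 49.0, "geneC": 49.0}
profile = expression_profile(normal, tumor)
```

In this hypothetical example `changed_genes(profile)` reports the two placeholder
genes whose level differs at least two-fold between the states, in either direction.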

Screens are performed to identify modulators of the lung cancer phenotype. In one 
embodiment, screening is performed to identify modulators that can induce or suppress a 
particular expression profile, thus preferably generating the associated phenotype. In another
embodiment, e.g., for diagnostic applications, having identified differentially expressed genes
important in a particular state, screens can be performed to identify modulators that alter
expression of individual genes. In another embodiment, screening is performed to identify
modulators that alter a biological function of the expression product of a differentially
expressed gene. Again, having identified the importance of a gene in a particular state,
screens are performed to identify agents that bind and/or modulate the biological activity of
the gene product, or evaluate genetic polymorphisms.

Genes can be screened for those that are induced in response to a candidate agent.
After identifying a modulator based upon its ability to suppress a lung cancer expression
pattern leading to a normal expression pattern, or to modulate a single lung cancer gene
expression profile so as to mimic the expression of the gene from normal tissue, a screen as
described above can be performed to identify genes that are specifically modulated in
response to the agent. Comparing expression profiles between normal tissue and
agent-treated lung cancer tissue reveals genes that are not expressed in normal tissue or lung
cancer tissue, but are expressed in agent-treated tissue. These agent-specific sequences can be
identified and used by methods described herein for lung cancer genes or proteins. In
particular these sequences and the proteins they encode find use in marking or identifying
agent-treated cells. In addition, antibodies can be raised against the agent-induced proteins
and used to target novel therapeutics to the treated lung cancer tissue sample.

Thus, in one embodiment, a test compound is administered to a population of lung
cancer cells that have an associated lung cancer expression profile. By "administration" or
"contacting" herein is meant that the candidate agent is added to the cells in such a manner as
to allow the agent to act upon the cell, whether by uptake and intracellular action, or by
action at the cell surface. In some embodiments, nucleic acid encoding a proteinaceous
candidate agent (i.e., a peptide) may be put into a viral construct such as an adenoviral or
retroviral construct, and added to the cell, such that expression of the peptide agent is
accomplished, e.g., PCT US97/01019. Regulatable gene therapy systems can also be used.

Once a test compound has been administered to the cells, the cells can be washed if
desired and are allowed to incubate under preferably physiological conditions for some
period of time. The cells are then harvested and a new gene expression profile is generated,
as outlined herein.

Thus, e.g., lung cancer or non-malignant tissue may be screened for agents that 
modulate, e.g., induce or suppress a lung cancer phenotype. A change in at least one gene, 
preferably many, of the expression profile indicates that the agent has an effect on lung
cancer activity. By defining such a signature for the lung cancer phenotype, screens for new
drugs that alter the phenotype can be devised. With this approach, the drug target need not be
known and need not be represented in the original expression screening platform, nor does
the level of transcript for the target protein need to change.

Measurement of lung cancer polypeptide activity, or of lung cancer or the lung cancer
phenotype, can be performed using a variety of assays. For example, the effects of the test
compounds upon the function of the metastatic polypeptides can be measured by examining
parameters described above. A suitable physiological change that affects activity can be used
to assess the influence of a test compound on the polypeptides of this invention. When the
functional consequences are determined using intact cells or animals, one can also measure a
variety of effects such as, in the case of lung cancer associated with tumors, tumor growth,
tumor metastasis, neovascularization, hormone release, transcriptional changes to both known
and uncharacterized genetic markers (e.g., northern blots), changes in cell metabolism such as
cell growth or pH changes, and changes in intracellular second messengers such as cGMP. In
the assays of the invention, mammalian lung cancer polypeptide is typically used, e.g.,
mouse, preferably human.

Assays to identify compounds with modulating activity can be performed in vitro.
For example, a lung cancer polypeptide is first contacted with a potential modulator and
incubated for a suitable amount of time, e.g., from 0.5 to 48 hours. In one embodiment, the
lung cancer polypeptide levels are determined in vitro by measuring the level of protein or
mRNA. The level of protein is typically measured using immunoassays such as western
blotting, ELISA and the like with an antibody that selectively binds to the lung cancer
polypeptide or a fragment thereof. For measurement of mRNA, amplification, e.g., using
PCR, LCR, or hybridization assays, e.g., northern hybridization, RNAse protection, dot
blotting, are preferred. The level of protein or mRNA is typically detected using directly or
indirectly labeled detection agents, e.g., fluorescently or radioactively labeled nucleic acids,
radioactively or enzymatically labeled antibodies, and the like, as described herein.

Alternatively, a reporter gene system can be devised using a lung cancer protein
promoter operably linked to a reporter gene such as luciferase, green fluorescent protein,
CAT, or β-gal. The reporter construct is typically transfected into a cell. After treatment
with a potential modulator, the amount of reporter gene transcription, translation, or activity 
is measured according to standard techniques known to those of skill in the art. 

In a preferred embodiment, as outlined above, screens may be done on individual
genes and gene products (proteins). That is, having identified a particular differentially
expressed gene as important in a particular state, screening of modulators of the expression of
the gene or the gene product itself can be done. The gene products of differentially expressed
genes are sometimes referred to herein as "lung cancer proteins." The lung cancer protein
may be a fragment, or alternatively, be the full length protein to a fragment shown herein.

In one embodiment, screening for modulators of expression of specific genes is
performed. Typically, the expression of only one or a few genes is evaluated. In another
embodiment, screens are designed to first find compounds that bind to differentially
expressed proteins. These compounds are then evaluated for the ability to modulate
differentially expressed activity. Moreover, once initial candidate compounds are identified,
variants can be further screened to better evaluate structure activity relationships.

In a preferred embodiment, binding assays are done. In general, purified or isolated
gene product is used; that is, the gene products of one or more differentially expressed
nucleic acids are made. For example, antibodies are generated to the protein gene products,
and standard immunoassays are run to determine the amount of protein present. Alternatively,
cells comprising the lung cancer proteins can be used in the assays.

Thus, in a preferred embodiment, the methods comprise combining a lung cancer
protein and a candidate compound, and determining the binding of the compound to the lung
cancer protein. Preferred embodiments utilize the human lung cancer protein, although other
mammalian proteins may also be used, e.g., for the development of animal models of human 
disease. In some embodiments, as outlined herein, variant or derivative lung cancer proteins 
may be used. 

Generally, in a preferred embodiment of the methods herein, the lung cancer protein
or the candidate agent is non-diffusably bound to an insoluble support, preferably having
isolated sample receiving areas (e.g., a microtiter plate, an array, etc.). The insoluble
supports may be made of any composition to which the compositions can be bound, that is
readily separated from soluble material, and that is otherwise compatible with the overall
method of screening. The surface of such supports may be solid or porous and of a convenient
shape. Examples of suitable insoluble supports include microtiter plates, arrays, membranes and
beads. These are typically made of glass, plastic (e.g., polystyrene), polysaccharides, nylon
or nitrocellulose, teflon™, etc. Microtiter plates and arrays are especially convenient because
a large number of assays can be carried out simultaneously, using small amounts of reagents
and samples. The particular manner of binding of the composition is typically not crucial so
long as it is compatible with the reagents and overall methods of the invention, maintains the
activity of the composition, and is nondiffusable. Preferred methods of binding include the
use of antibodies (which do not sterically block either the ligand binding site or activation
sequence when the protein is bound to the support), direct binding to "sticky" or ionic
supports, chemical crosslinking, the synthesis of the protein or agent on the surface, etc.
Following binding of the protein or agent, excess unbound material is removed by washing.
The sample receiving areas may then be blocked through incubation with bovine serum
albumin (BSA), casein or other innocuous protein or other moiety.

In a preferred embodiment, the lung cancer protein is bound to the support, and a test 
compound is added to the assay. Alternatively, the candidate agent is bound to the support 
and the lung cancer protein is added. Novel binding agents include specific antibodies,
non-natural binding agents identified in screens of chemical libraries, peptide analogs, etc. Of
particular interest are screening assays for agents that have a low toxicity for human cells. A
wide variety of assays may be used for this purpose, including labeled in vitro protein-protein 
binding assays, electrophoretic mobility shift assays, immunoassays for protein binding, 
functional assays (phosphorylation assays, etc.) and the like. 

The determination of the binding of the test modulating compound to the lung cancer
protein may be done in a number of ways. In a preferred embodiment, the compound is
labeled, and binding determined directly, e.g., by attaching all or a portion of the lung cancer
protein to a solid support, adding a labeled candidate agent (e.g., a fluorescent label), washing 
off excess reagent, and determining whether the label is present on the solid support. Various 
blocking and washing steps may be utilized as appropriate. 

In some embodiments, only one of the components is labeled, e.g., the proteins (or
proteinaceous candidate compounds) can be labeled. Alternatively, more than one
component can be labeled with different labels, e.g., ¹²⁵I for the proteins and a fluorophore for
the compound. Proximity reagents, e.g., quenching or energy transfer reagents, are also
useful.

In one embodiment, the binding of the test compound is determined by competitive
binding assay. The competitor may be a binding moiety known to bind to the target molecule
(i.e., a lung cancer protein), such as an antibody, peptide, binding partner, ligand, etc. Under
certain circumstances, there may be competitive binding between the compound and the
binding moiety, with the binding moiety displacing the compound. In one embodiment, the
test compound is labeled. Either the compound, or the competitor, or both, is added first to
the protein for a time sufficient to allow binding, if present. Incubations may be performed at
a temperature which facilitates optimal activity, typically between 4 and 40°C. Incubation
periods are typically optimized, e.g., to facilitate rapid high throughput screening. Typically
between 0.1 and 1 hour will be sufficient. Excess reagent is generally removed or washed
away. The second component is then added, and the presence or absence of the labeled
component is followed, to indicate binding.

In a preferred embodiment, the competitor is added first, followed by a test
compound. Displacement of the competitor is an indication that the test compound is binding
to the lung cancer protein and thus is capable of binding to, and potentially modulating, the
activity of the lung cancer protein. In this embodiment, either component can be labeled.
Thus, e.g., if the competitor is labeled, the presence of label in the wash solution indicates
displacement by the agent. Alternatively, if the test compound is labeled, the presence of the
label on the support indicates displacement.

In an alternative embodiment, the test compound is added first, with incubation and 
washing, followed by the competitor. The absence of binding by the competitor may indicate 
that the test compound is bound to the lung cancer protein with a higher affinity. Thus, if the 
test compound is labeled, the presence of the label on the support, coupled with a lack of
competitor binding, may indicate that the test compound is capable of binding to the lung
cancer protein. 

In a preferred embodiment, the methods comprise differential screening to identify
agents that are capable of modulating the activity of the lung cancer proteins. In one
embodiment, the methods comprise combining a lung cancer protein and a competitor in a
first sample. A second sample comprises a test compound, a lung cancer protein, and a
competitor. The binding of the competitor is determined for both samples, and a change, or
difference in binding between the two samples indicates the presence of an agent capable of
binding to the lung cancer protein and potentially modulating its activity. That is, if the
binding of the competitor is different in the second sample relative to the first sample, the
agent is capable of binding to the lung cancer protein.
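
The two-sample comparison just described may be sketched as follows. The fractional
change, the 20% cutoff, the function names, and the example signals are assumptions
made for illustration and are not part of the disclosed method.

```python
def competitor_binding_change(control_signal, test_signal):
    """Fractional change in competitor binding caused by the test compound.

    `control_signal`: bound competitor with protein and competitor only.
    `test_signal`: bound competitor with protein, competitor, and test compound.
    A negative value means less competitor bound when the compound is present.
    """
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return (test_signal - control_signal) / control_signal


def is_candidate_binder(control_signal, test_signal, min_change=0.2):
    """Flag a compound when binding differs by at least `min_change` (assumed cutoff)."""
    return abs(competitor_binding_change(control_signal, test_signal)) >= min_change


# Hypothetical signals: competitor binding drops from 1000 to 400 units.
change = competitor_binding_change(control_signal=1000.0, test_signal=400.0)
```

Here the first sample supplies `control_signal` and the second sample supplies
`test_signal`; a difference in either direction is treated as an indication of
binding, in keeping with the paragraph above.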

Alternatively, differential screening is used to identify drug candidates that bind to the 
native lung cancer protein, but cannot bind to modified lung cancer proteins. The structure of 
the lung cancer protein may be modeled, and used in rational drug design to synthesize agents 
that interact with that site. Drug candidates that affect the activity of a lung cancer protein
are also identified by screening drugs for the ability to either enhance or reduce the activity of
the protein. 

Positive controls and negative controls may be used in the assays. Preferably control
and test samples are performed in at least triplicate to obtain statistically significant results.
Incubation of all samples is for a time sufficient for the binding of the agent to the protein.
Following incubation, samples are washed free of non-specifically bound material and the
amount of bound, generally labeled agent determined. For example, where a radiolabel is
employed, the samples may be counted in a scintillation counter to determine the amount of
bound compound.
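
As a simple illustration of how triplicate counts might be summarized, the following
sketch computes a mean and sample standard deviation for each group and subtracts the
negative-control background. The function names and the counts shown are hypothetical,
and subtracting the mean background is an assumption of the sketch rather than a
requirement of the assays.

```python
from statistics import mean, stdev


def summarize(replicates):
    """Mean and sample standard deviation of replicate measurements."""
    if len(replicates) < 3:
        raise ValueError("at least triplicate measurements are expected")
    return mean(replicates), stdev(replicates)


def specific_binding(total_counts, background_counts):
    """Mean test signal minus mean negative-control (background) signal."""
    return mean(total_counts) - mean(background_counts)


# Hypothetical scintillation counts (counts per minute), in triplicate.
test_cpm = [5200.0, 5000.0, 4800.0]
negative_cpm = [300.0, 320.0, 280.0]

test_mean, test_sd = summarize(test_cpm)
bound = specific_binding(test_cpm, negative_cpm)
```

Reporting the spread alongside the mean is what allows replicate measurements to
support a judgment about whether a difference between test and control samples is
meaningful.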

A variety of other reagents may be included in the screening assays. These include
reagents like salts, neutral proteins, e.g., albumin, detergents, etc., which may be used to
facilitate optimal protein-protein binding and/or reduce non-specific or background
interactions. Also, reagents that otherwise improve the efficiency of the assay, such as
protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The mixture
of components may be added in an order that provides for the requisite binding.

In a preferred embodiment, the invention provides methods for screening for a 
compound capable of modulating the activity of a lung cancer protein. The methods 
comprise adding a test compound, as defined above, to a cell comprising lung cancer
proteins. Preferred cell types include almost any cell. The cells contain a recombinant
nucleic acid that encodes a lung cancer protein. In a preferred embodiment, a library of
candidate agents is tested on a plurality of cells.

In one aspect, the assays are evaluated in the presence or absence of, or with previous or
subsequent exposure to, physiological signals, e.g., hormones, antibodies, peptides, antigens,
cytokines, growth factors, action potentials, pharmacological agents including
chemotherapeutics, radiation, carcinogenics, or other cells (e.g., cell-cell contacts). In another
example, the determinations are made at different stages of the cell cycle process.

In this way, compounds that modulate lung cancer agents are identified. Compounds
with pharmacological activity are able to enhance or interfere with the activity of the lung
cancer protein. Once identified, similar structures are evaluated to identify critical structural
features of the compound.

In one embodiment, a method of inhibiting lung cancer cell division is provided. The 
method comprises administration of a lung cancer inhibitor. In another embodiment, a 
method of inhibiting lung cancer is provided. The method may comprise administration of a
lung cancer inhibitor. In a further embodiment, methods of treating cells or individuals with
lung cancer are provided, e.g., comprising administration of a lung cancer inhibitor. 

In one embodiment, a lung cancer inhibitor is an antibody as discussed above. In
another embodiment, the lung cancer inhibitor is an antisense molecule.

A variety of cell growth, proliferation, viability, and metastasis assays are known to
those of skill in the art, as described below.

Soft agar growth or colony formation in suspension 

Normal cells require a solid substrate to attach and grow. When the cells are 
transformed, they lose this phenotype and grow detached from the substrate. For example, 
transformed cells can grow in stirred suspension culture or suspended in semi-solid media, 
such as semi-solid or soft agar. The transformed cells, when transfected with tumor 
suppressor genes, regain a normal phenotype and require a solid substrate to attach and 
grow. Soft agar growth or colony formation in suspension assays can be used to identify 
modulators of lung cancer sequences which, when expressed in host cells, inhibit abnormal 
cellular proliferation and transformation. A therapeutic compound would reduce or eliminate 
the host cells' ability to grow in stirred suspension culture or suspended in semi-solid media, 
such as semi-solid or soft agar. 

Techniques for soft agar growth or colony formation in suspension assays are 
described in Freshney (1994) Culture of Animal Cells: A Manual of Basic Technique (3rd ed.), 
herein incorporated by reference. See also, the methods section of Garkavtsev, et al. (1996), 
supra, herein incorporated by reference. 

Contact inhibition and density limitation of growth 

Normal cells typically grow in a flat and organized pattern in a petri dish until they 
touch other cells. When the cells touch one another, they are contact inhibited and stop 
growing. When cells are transformed, however, the cells are not contact inhibited and 
continue to grow to high densities in disorganized foci. Thus, the transformed cells grow to a 
higher saturation density than normal cells. This can be detected morphologically by the 
formation of a disoriented monolayer of cells or rounded cells in foci within the regular 
pattern of normal surrounding cells. Alternatively, labeling index with (3H)-thymidine at 
saturation density can be used to measure density limitation of growth. See Freshney (1994), 
supra. The transformed cells, when transfected with tumor suppressor genes, regain a 
normal phenotype, become contact inhibited, and grow to a lower density. 

In this assay, labeling index with (3H)-thymidine at saturation density is a preferred 
method of measuring density limitation of growth. Transformed host cells are transfected 
with a lung cancer-associated sequence and are grown for 24 hours at saturation density in 
non-limiting medium conditions. The percentage of cells labeling with (3H)-thymidine is 
determined autoradiographically. See, Freshney (1994), supra. 
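
As an illustration of the readout described above, the labeling index can be expressed as the 
percentage of counted cells that carry autoradiographic grains. The following is a minimal 
sketch, assuming per-field counts of labeled and total cells; the function names and the 
example counts are hypothetical and are not taken from the cited reference. 

```python
def labeling_index(labeled_cells, total_cells):
    """Return the percentage of counted cells labeled with (3H)-thymidine."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= labeled_cells <= total_cells:
        raise ValueError("labeled_cells must lie between 0 and total_cells")
    return 100.0 * labeled_cells / total_cells


def pooled_labeling_index(fields):
    """Pool (labeled, total) counts from several microscope fields."""
    labeled = sum(count for count, _ in fields)
    total = sum(count for _, count in fields)
    return labeling_index(labeled, total)


# Hypothetical counts: (labeled cells, total cells) per field.
control_fields = [(42, 100), (38, 100), (45, 100)]
transfected_fields = [(12, 100), (9, 100), (15, 100)]

control_li = pooled_labeling_index(control_fields)
transfected_li = pooled_labeling_index(transfected_fields)
print(f"control: {control_li:.1f}%  transfected: {transfected_li:.1f}%")
```

A lower labeling index in the transfected cells than in the control would be consistent with 
restored density limitation of growth. 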



Growth factor or serum dependence 

Transformed cells typically have a lower serum dependence than their normal 
counterparts (see, e.g., Temin (1966) J. Natl. Cancer Inst. 37:167-175; Eagle, et al. (1970) J. 
Exp. Med. 131:836-879; Freshney, supra). This is in part due to release of various growth 
factors by the transformed cells. Growth factor or serum dependence of transformed host 
cells can be compared with that of control. 



Tumor specific marker levels 

Tumor cells release a greater amount of certain factors (hereinafter "tumor 
specific markers") than their normal counterparts. For example, plasminogen activator (PA) 
is released from human glioma at a higher level than from normal brain cells (see, e.g., 
Gullino, "Angiogenesis, tumor vascularization, and potential interference with tumor growth" 
in Mihich (ed. 1985) Biological Responses in Cancer, pp. 178-184). Similarly, tumor 
angiogenesis factor (TAF) is released at a higher level in tumor cells than in their normal 
counterparts. See, e.g., Folkman (1992) "Angiogenesis and Cancer" in Sem. Cancer Biol. 

Various techniques which measure the release of these factors are described in 
Freshney (1994), supra. Also, see, Unkeless, et al. (1974) J. Biol. Chem. 249:4295-4305; 
Strickland and Beers (1976) J. Biol. Chem. 251:5694-5702; Whur, et al. (1980) Br. J. Cancer 
42:305-312; Gullino, "Angiogenesis, tumor vascularization, and potential interference with 
tumor growth" in Mihich (ed. 1985) Biological Responses in Cancer, pp. 178-184; Freshney 
Anticancer Res. 5:111-130 (1985). 


Invasiveness into Matrigel 

The degree of invasiveness into Matrigel or some other extracellular matrix 
constituent can be used as an assay to identify compounds that modulate lung 
cancer-associated sequences. Tumor cells exhibit a good correlation between malignancy and 
invasiveness of cells into Matrigel or some other extracellular matrix constituent. In this 
assay, tumorigenic cells are typically used as host cells. Expression of a tumor suppressor 
gene in these host cells would decrease invasiveness of the host cells. 

Techniques described in Freshney (1994), supra, can be used. Briefly, the level of 
invasion of host cells can be measured by using filters coated with Matrigel or some other 
extracellular matrix constituent. Penetration into the gel, or through to the distal side of the 
filter, is rated as invasiveness, and rated histologically by number of cells and distance 
moved, or by prelabeling the cells with 125I and counting the radioactivity on the distal side of 
the filter or bottom of the dish. See, e.g., Freshney (1984), supra. 



Tumor growth in vivo 

Effects of lung cancer-associated sequences on cell growth can be tested in transgenic 
or immune-suppressed mice. Knock-out transgenic mice can be made, in which the lung 
cancer gene is disrupted or in which a lung cancer gene is inserted. Knock-out transgenic 
mice can be made by insertion of a marker gene or other heterologous gene into the 
endogenous lung cancer gene site in the mouse genome via homologous recombination. 
Such mice can also be made by substituting the endogenous lung cancer gene with a mutated 
version of the lung cancer gene, or by mutating the endogenous lung cancer gene, e.g., by 
exposure to carcinogens. 

A DNA construct is introduced into the nuclei of embryonic stem cells. Cells 
containing the newly engineered genetic lesion are injected into a host mouse embryo, which 
is re-implanted into a recipient female. Some of these embryos develop into chimeric mice 
that possess germ cells partially derived from the mutant cell line. Therefore, by breeding the 
chimeric mice it is possible to obtain a new line of mice containing the introduced genetic 
lesion (see, e.g., Capecchi, et al. (1989) Science 244:1288). Chimeric targeted mice can be 
derived according to Hogan, et al. (1988) Manipulating the Mouse Embryo: A Laboratory 
Manual, Cold Spring Harbor Laboratory, and Robertson (ed. 1987) Teratocarcinomas and 
Embryonic Stem Cells: A Practical Approach, IRL Press, Washington, D.C. 

Alternatively, various immune-suppressed or immune-deficient host animals can be 
used. For example, a genetically athymic "nude" mouse (see, e.g., Giovanella, et al. (1974) J. 
Natl. Cancer Inst. 52:921), a SCID mouse, a thymectomized mouse, or an irradiated mouse 
(see, e.g., Bradley, et al. (1978) Br. J. Cancer 38:263; Selby, et al. (1980) Br. J. Cancer 41:52) 
can be used as a host. Transplantable tumor cells (typically about 10^6 cells) injected into 
isogenic hosts will produce invasive tumors in a high proportion of cases, while normal cells 
of similar origin will not. In hosts which developed invasive tumors, cells expressing a lung 
cancer-associated sequence are injected subcutaneously. After a suitable length of time, 
preferably 4-8 weeks, tumor growth is measured (e.g., by volume or by its two largest 
dimensions) and compared to the control. Tumors that show a statistically significant reduction 
(using, e.g., Student's t test) are said to have inhibited growth. 
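
The comparison described above can be sketched as follows. This is an illustrative example 
only: the ellipsoid approximation V = L x W^2 / 2 for estimating volume from the two largest 
dimensions is an assumption (the text does not specify a formula), the measurements are 
hypothetical, and the critical value is the tabulated two-sided value for alpha = 0.05 with 8 
degrees of freedom. 

```python
import math
from statistics import mean, variance


def tumor_volume(length_mm, width_mm):
    """Estimate volume (mm^3) from the two largest dimensions.

    Assumption: ellipsoid approximation V = L * W^2 / 2, with W the
    smaller of the two dimensions; the text does not give a formula.
    """
    longer, shorter = max(length_mm, width_mm), min(length_mm, width_mm)
    return longer * shorter ** 2 / 2.0


def student_t(sample_a, sample_b):
    """Two-sample Student's t statistic (pooled variance) and its df."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df


# Hypothetical (length, width) measurements in mm, five mice per group.
control_dims = [(12, 10), (13, 10), (12, 11), (14, 10), (13, 11)]
treated_dims = [(8, 6), (9, 7), (8, 7), (7, 6), (9, 6)]

control = [tumor_volume(l, w) for l, w in control_dims]
treated = [tumor_volume(l, w) for l, w in treated_dims]

t_stat, df = student_t(control, treated)
T_CRITICAL = 2.306  # two-sided, alpha = 0.05, df = 8
inhibited = mean(treated) < mean(control) and abs(t_stat) > T_CRITICAL
print(f"t = {t_stat:.2f}, df = {df}, growth inhibited: {inhibited}")
```

With other group sizes the critical value changes with the degrees of freedom, so in practice 
a statistics package that reports a p-value directly would normally be used. 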



Polynucleotide modulators of lung cancer 

Antisense and RNAi Polynucleotides 

In certain embodiments, the activity of a lung cancer-associated protein is 
downregulated, or entirely inhibited, by the use of antisense or an inhibitory polynucleotide, 
i.e., a nucleic acid complementary to, and which can preferably hybridize specifically to, a 
coding mRNA nucleic acid sequence, e.g., a lung cancer protein mRNA, or a subsequence 
thereof. Binding of the antisense polynucleotide to the mRNA reduces the translation and/or 
stability of the mRNA. 

In the context of this invention, antisense polynucleotides can comprise 
naturally-occurring nucleotides, or synthetic species formed from naturally-occurring subunits or their 
close homologs. Antisense polynucleotides may also have altered sugar moieties or 
inter-sugar linkages. Exemplary among these are the phosphorothioate and other sulfur-containing 
species which are known for use in the art. Analogs are comprehended by this invention so 
long as they function effectively to hybridize with the lung cancer protein mRNA. See, e.g., 
Isis Pharmaceuticals, Carlsbad, CA; Sequitur, Inc., Natick, MA. 

Such antisense polynucleotides can readily be synthesized using recombinant means, 
or can be synthesized in vitro. Equipment for such synthesis is sold by several vendors, 
including Applied Biosystems. The preparation of other oligonucleotides such as 
phosphorothioates and alkylated derivatives is also well known to those of skill in the art. 

Antisense molecules as used herein include antisense or sense oligonucleotides. 
Sense oligonucleotides can, e.g., be employed to block transcription by binding to the 
antisense strand. The antisense and sense oligonucleotides comprise a single-stranded nucleic 
acid sequence (either RNA or DNA) capable of binding to target mRNA (sense) or DNA 
(antisense) sequences for lung cancer molecules. A preferred antisense molecule is for a lung 
cancer sequence in the tables, or for a ligand or activator thereof. Antisense or sense 
oligonucleotides, according to the present invention, comprise a fragment generally at least 
about 14 nucleotides, preferably from about 14 to 30 nucleotides. The ability to derive an 
antisense or a sense oligonucleotide, based upon a cDNA sequence encoding a given protein, 
is described in, e.g., Stein and Cohen (1988) Cancer Res. 48:2659 and van der Krol, et al. 
(1988) BioTechniques 6:958. 
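
To illustrate how an antisense oligonucleotide can be derived from a cDNA sequence, the 
following minimal sketch tiles candidate antisense DNA oligonucleotides of about 14 to 30 
nucleotides along a sense-strand sequence by taking the reverse complement of each window. 
The function names and the cDNA fragment are hypothetical; the sketch does not rank 
candidates for specificity, secondary structure, or target accessibility. 

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    dna = dna.upper()
    if set(dna) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G, and T")
    return dna.translate(COMPLEMENT)[::-1]


def antisense_candidates(cdna, length=20, step=1):
    """Yield (start, oligo) pairs of antisense DNA oligos tiled along a cDNA.

    Each oligo is the reverse complement of a window of the sense strand,
    so it can base-pair with the corresponding stretch of the mRNA.
    """
    if not 14 <= length <= 30:
        raise ValueError("length should be about 14 to 30 nucleotides")
    for start in range(0, len(cdna) - length + 1, step):
        yield start, reverse_complement(cdna[start:start + length])


# Hypothetical cDNA fragment, not a sequence from the tables.
cdna = "ATGGCCAAGCTTGACCTGAAGGTTCGATAA"
candidates = list(antisense_candidates(cdna, length=14, step=8))
for start, oligo in candidates:
    print(start, oligo)
```

A sense oligonucleotide for the same window is simply the window itself. 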

RNA interference is a mechanism to suppress gene expression in a sequence specific 
manner. See, e.g., Brummelkamp, et al. (2002) Sciencexpress (21 March 2002); Sharp (1999) 
Genes Dev. 13:139-141; and Carthew (2001) Curr. Op. Cell Biol. 13:244-248. In mammalian 
cells, short, e.g., 21 nt, double stranded small interfering RNAs (siRNA) have been shown to 
be effective at inducing an RNAi response. See, e.g., Elbashir, et al. (2001) Nature 
411:494-498. The mechanism may be used to downregulate expression levels of identified genes, e.g., 
for treatment of, or validation of relevance to, disease. 


Ribozymes 

In addition to antisense polynucleotides, ribozymes can be used to target and inhibit 
transcription of lung cancer-associated nucleotide sequences. A ribozyme is an RNA 
molecule that catalytically cleaves other RNA molecules. Different kinds of ribozymes have 
been described, including group I ribozymes, hammerhead ribozymes, hairpin ribozymes, 
RNase P, and axhead ribozymes (see, e.g., Castanotto, et al. (1994) Adv. in Pharmacology 
25:289-317 for a general review of the properties of different ribozymes). 

The general features of hairpin ribozymes are described, e.g., in Hampel, et al. (1990) 
Nucl. Acids Res. 18:299-304; European Patent Publication No. 0 360 257; U.S. Patent No. 
5,254,678. Methods of preparing them are well known to those of skill in the art (see, e.g., WO 
94/26877; Ojwang, et al. (1993) Proc. Natl. Acad. Sci. USA 90:6340-6344; Yamada, et al. 
(1994) Human Gene Therapy 1:39-45; Leavitt, et al. (1995) Proc. Natl. Acad. Sci. USA 
92:699-703; Leavitt, et al. (1994) Human Gene Therapy 5:1151-120; and Yamada, et al. 
(1994) Virology 205:121-126). 

Polynucleotide modulators of lung cancer may be introduced into a cell containing the 
target nucleotide sequence by formation of a conjugate with a ligand binding molecule, as 
described in WO 91/04753. Suitable ligand binding molecules include, but are not limited to, 
cell surface receptors, growth factors, other cytokines, or other ligands that bind to cell 
surface receptors. Preferably, conjugation of the ligand binding molecule does not 
substantially interfere with the ability of the ligand binding molecule to bind to its 
corresponding molecule or receptor, or block entry of the sense or antisense oligonucleotide 
or its conjugated version into the cell. Alternatively, a polynucleotide modulator of lung 
cancer may be introduced into a cell containing the target nucleic acid sequence, e.g., by 
formation of a polynucleotide-lipid complex, as described in WO 90/10448. It is 
understood that antisense molecules or knock out and knock in models may also be 
used in screening assays as discussed above, in addition to methods of treatment. 

Thus, in one embodiment, methods of modulating lung cancer in cells or organisms 
are provided. In one embodiment, the methods comprise administering to a cell an anti-lung 
cancer antibody that reduces or eliminates the biological activity of an endogenous lung 
cancer protein. Alternatively, the methods comprise administering to a cell or organism a 
recombinant nucleic acid encoding a lung cancer protein. This may be accomplished in any 
number of ways. In a preferred embodiment, e.g., when the lung cancer sequence is 
down-regulated in lung cancer, such state may be reversed by increasing the amount of lung cancer 
gene product in the cell. This can be accomplished, e.g., by overexpressing the endogenous 
lung cancer gene or administering a gene encoding the lung cancer sequence, using known 
gene-therapy techniques. In a preferred embodiment, the gene therapy techniques include the 
incorporation of the exogenous gene using enhanced homologous recombination (EHR), e.g., 
as described in PCT/US93/03868, hereby incorporated by reference in its entirety. 
Alternatively, e.g., when the lung cancer sequence is up-regulated in lung cancer, the activity 
of the endogenous lung cancer gene is decreased, e.g., by the administration of a lung cancer 
antisense or RNAi nucleic acid. 

In one embodiment, the lung cancer proteins of the present invention may be used to 
generate polyclonal and monoclonal antibodies to lung cancer proteins. Similarly, the lung 
cancer proteins can be coupled, using standard technology, to affinity chromatography 
columns. These columns may then be used to purify lung cancer antibodies useful for 
production, diagnostic, or therapeutic purposes. In a preferred embodiment, the antibodies 
are generated to epitopes unique to a lung cancer protein; that is, the antibodies show little or 
no cross-reactivity to other proteins. The lung cancer antibodies may be coupled to standard 
affinity chromatography columns and used to purify lung cancer proteins. The antibodies 
may also be used as blocking polypeptides, as outlined above, since they will specifically 
bind to the lung cancer protein. 

Methods of identifying variant lung cancer-associated sequences 

Without being bound by theory, expression of various lung cancer sequences is 
correlated with lung cancer. Accordingly, disorders based on mutant or variant lung cancer 
genes may be determined. In one embodiment, the invention provides methods for 
identifying cells containing variant lung cancer genes, e.g., determining all or part of the 
sequence of at least one endogenous lung cancer gene in a cell. In a preferred embodiment, 
the invention provides methods of identifying the lung cancer genotype of an individual, e.g., 
determining all or part of the sequence of at least one lung cancer gene of the individual. 
This is generally done in at least one tissue of the individual, and may include the evaluation 
of a number of tissues or different samples of the same tissue. The method may include 
comparing the sequence of the sequenced lung cancer gene to a known lung cancer gene, i.e., 
a wild-type gene. 

The sequence of all or part of the lung cancer gene can then be compared to the 
sequence of a known lung cancer gene to determine if any differences exist. This can be 
done using known homology programs, such as Bestfit, etc. In a preferred embodiment, the 
presence of a difference in the sequence between the lung cancer gene of the patient and the 
known lung cancer gene correlates with a disease state or a propensity for a disease state, as 
outlined herein. 
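
The comparison step can be illustrated with a minimal sketch that lists the positions at which 
a sample sequence differs from a reference sequence. The sketch uses the general-purpose 
matcher in Python's standard difflib module as a stand-in; it is not Bestfit and does not use a 
scoring matrix or gap penalties. The function name and both sequences are hypothetical. 

```python
from difflib import SequenceMatcher


def sequence_differences(reference, sample):
    """List differences between a reference gene sequence and a sample.

    Returns (kind, reference_position, reference_bases, sample_bases)
    tuples, with 0-based positions in the reference. The kind is one of
    "replace", "delete", or "insert".
    """
    matcher = SequenceMatcher(None, reference, sample, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append((tag, i1, reference[i1:i2], sample[j1:j2]))
    return differences


# Hypothetical sequences for illustration only.
wild_type = "ATGGCCAAGCTTGACCTGAAG"
patient = "ATGGCCAAGCATGACCTGAAG"  # single-base substitution

found = sequence_differences(wild_type, patient)
for kind, pos, ref_bases, alt_bases in found:
    print(f"{kind} at {pos}: {ref_bases or '-'} -> {alt_bases or '-'}")
```

Any difference reported this way would still need to be evaluated against known variants 
before being associated with a disease state. 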

In a preferred embodiment, the lung cancer genes are used as probes to determine the 
number of copies of the lung cancer gene in the genome. 

In another preferred embodiment, the lung cancer genes are used as probes to 
determine the chromosomal localization of the lung cancer genes. Information such as 
chromosomal localization finds use in providing a diagnosis or prognosis, in particular when 
chromosomal abnormalities such as translocations and the like are identified in the lung 
cancer gene locus. 

Administration of pharmaceutical and vaccine compositions 

In one embodiment, a therapeutically effective dose of a lung cancer protein or 
modulator thereof is administered to a patient. By "therapeutically effective dose" herein is 
meant a dose that produces the effects for which it is administered. The exact dose will depend 
on the purpose of the treatment, and will be ascertainable by one skilled in the art using 
known techniques (e.g., Ansel, et al. (1992) Pharmaceutical Dosage Forms and Drug 
Delivery; Lieberman, Pharmaceutical Dosage Forms (vols. 1-3), Dekker, ISBN 0824770846, 
082476918X, 0824712692, 0824716981; Lloyd (1999) The Art, Science and Technology of 
Pharmaceutical Compounding; and Pickar (1999) Dosage Calculations). Adjustments for 
lung cancer degradation, systemic versus localized delivery, and rate of new protease 
synthesis, as well as the age, body weight, general health, sex, diet, time of administration, 
drug interaction and the severity of the condition may be necessary, and will be ascertainable 
with routine experimentation by those skilled in the art. 

A "patient" for the purposes of the present invention includes both humans and other 
animals, particularly mammals. Thus the methods are applicable to both human therapy and 
veterinary applications. In the preferred embodiment the patient is a mammal, preferably a 
primate, and in the most preferred embodiment the patient is human. 

The administration of the lung cancer proteins and modulators thereof of the present 
invention can be done in a variety of ways, including, but not limited to, orally, 
subcutaneously, intravenously, intranasally, transdermally, intraperitoneally, intramuscularly, 
intrapulmonary, vaginally, rectally, or intraocularly. In some instances, e.g., in the treatment 
of wounds and inflammation, the lung cancer proteins and modulators may be directly 
applied as a solution or spray. 

The pharmaceutical compositions of the present invention comprise a lung cancer 
protein in a form suitable for administration to a patient. In the preferred embodiment, the 
pharmaceutical compositions are in a water soluble form, such as being present as 
pharmaceutically acceptable salts, which is meant to include both acid and base addition 
salts. "Pharmaceutically acceptable acid addition salt" refers to those salts that retain the 
biological effectiveness of the free bases and that are not biologically or otherwise 
undesirable, formed with inorganic acids such as hydrochloric acid, hydrobromic acid, 
sulfuric acid, nitric acid, phosphoric acid and the like, and organic acids such as acetic acid, 
propionic acid, glycolic acid, pyruvic acid, oxalic acid, maleic acid, malonic acid, succinic 
acid, fumaric acid, tartaric acid, citric acid, benzoic acid, cinnamic acid, mandelic acid, 
methanesulfonic acid, ethanesulfonic acid, p-toluenesulfonic acid, salicylic acid and the like. 
"Pharmaceutically acceptable base addition salts" include those derived from inorganic bases 
such as sodium, potassium, lithium, ammonium, calcium, magnesium, iron, zinc, copper, 
manganese, aluminum salts and the like. Particularly preferred are the ammonium, 
potassium, sodium, calcium, and magnesium salts. Salts derived from pharmaceutically 
acceptable organic non-toxic bases include salts of primary, secondary, and tertiary amines, 
substituted amines including naturally occurring substituted amines, cyclic amines and basic 
ion exchange resins, such as isopropylamine, trimethylamine, diethylamine, triethylamine, 
tripropylamine, and ethanolamine. 

The pharmaceutical compositions may also include one or more of the following: 
carrier proteins such as serum albumin; buffers; fillers such as microcrystalline cellulose, 
lactose, corn and other starches; binding agents; sweeteners and other flavoring agents; 
coloring agents; and polyethylene glycol. 

The pharmaceutical compositions can be administered in a variety of unit dosage 
forms depending upon the method of administration. For example, unit dosage forms 
suitable for oral administration include, but are not limited to, powder, tablets, pills, capsules 
and lozenges. It is recognized that lung cancer protein modulators (e.g., antibodies, antisense 
constructs, ribozymes, small organic molecules, etc.), when administered orally, should be 
protected from digestion. This is typically accomplished either by complexing the 
molecule(s) with a composition to render it resistant to acidic and enzymatic hydrolysis, or by 
packaging the molecule(s) in an appropriately resistant carrier, such as a liposome or a 
protection barrier. Means of protecting agents from digestion are well known in the art. 

The compositions for administration will commonly comprise a lung cancer protein 
modulator dissolved in a pharmaceutically acceptable carrier, preferably an aqueous carrier. 
A variety of aqueous carriers can be used, e.g., buffered saline and the like. These solutions 
are sterile and generally free of undesirable matter. These compositions may be sterilized by 
conventional, well known sterilization techniques. The compositions may contain 
pharmaceutically acceptable auxiliary substances as required to approximate physiological 
conditions such as pH adjusting and buffering agents, toxicity adjusting agents and the like, 
e.g., sodium acetate, sodium chloride, potassium chloride, calcium chloride, sodium lactate 
and the like. The concentration of active agent in these formulations can vary widely, and 
will be selected primarily based on fluid volumes, viscosities, body weight and the like in 
accordance with the particular mode of administration selected and the patient's needs (e.g., 
Remington's Pharmaceutical Science (15th ed., 1980) and Hardman, et al. (eds. 1996) 
Goodman and Gilman: The Pharmacological Basis of Therapeutics). 

Thus, a typical pharmaceutical composition for intravenous administration would be 
about 0.1 to 10 mg per patient per day. Dosages from 0.1 up to about 100 mg per patient per 
day may be used, particularly when the drug is administered to a secluded site and not into 
the blood stream, such as into a body cavity or into a lumen of an organ. Substantially higher 
dosages are possible in topical administration. Actual methods for preparing parenterally 
administrable compositions will be known or apparent to those skilled in the art, e.g., 
Remington's Pharmaceutical Science and Goodman and Gilman, The Pharmacological Basis 
of Therapeutics, supra. 

The compositions containing modulators of lung cancer proteins can be administered 
for therapeutic or prophylactic treatments. In therapeutic applications, compositions are 
administered to a patient suffering from a disease (e.g., a cancer) in an amount sufficient to 
cure or at least partially arrest the disease and its complications. An amount adequate to 
accomplish this is defined as a "therapeutically effective dose." Amounts effective for this 
use will depend upon the severity of the disease and the general state of the patient's health. 
Single or multiple administrations of the compositions may be administered depending on the 
dosage and frequency as required and tolerated by the patient. In any event, the composition 
should provide a sufficient quantity of the agents of this invention to effectively treat the 
patient. An amount of modulator that is capable of preventing or slowing the development of 
cancer in a mammal is referred to as a "prophylactically effective dose." The particular dose 
required for a prophylactic treatment will depend upon the medical condition and history of 
the mammal, the particular cancer being prevented, as well as other factors such as age, 
weight, gender, administration route, efficiency, etc. Such prophylactic treatments may be 
used, e.g., in a mammal who has previously had cancer to prevent a recurrence of the cancer, 
or in a mammal who is suspected of having a significant likelihood of developing cancer 
based, at least in part, upon gene expression profiles. Vaccine strategies may be used, in 
either a DNA vaccine form, or protein vaccine. 

It will be appreciated that the present lung cancer protein-modulating compounds can 
be administered alone or in combination with additional lung cancer modulating compounds 
or with other therapeutic agents, e.g., other anti-cancer agents or treatments. 

In numerous embodiments, one or more nucleic acids, e.g., polynucleotides 
comprising nucleic acid sequences set forth in the tables, such as antisense or RNAi 
polynucleotides or ribozymes, will be introduced into cells, in vitro or in vivo. The present 
invention provides methods, reagents, vectors, and cells useful for expression of lung 
cancer-associated polypeptides and nucleic acids using in vitro (cell-free), ex vivo, or in vivo (cell or 
organism-based) recombinant expression systems. 

The particular procedure used to introduce the nucleic acids into a host cell for 
expression of a protein or nucleic acid is application specific. Many procedures for 
introducing foreign nucleotide sequences into host cells may be used. These include the use 
of calcium phosphate transfection, spheroplasts, electroporation, liposomes, microinjection, 
plasmid vectors, viral vectors and other well known methods for introducing cloned genomic 
DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see, e.g., 
Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology 
volume 152 (Berger); Ausubel, et al. (eds. 1999) Current Protocols (supplemented through 
1999); and Sambrook, et al. (1989) Molecular Cloning - A Laboratory Manual (2nd ed., Vol. 
1-3)). 

In a preferred embodiment, lung cancer proteins and modulators are administered as 
therapeutic agents, and can be formulated as outlined above. Similarly, lung cancer genes 
(including the full-length sequence, partial sequences, or regulatory sequences of the 
lung cancer coding regions) can be administered in a gene therapy application. These lung 
cancer genes can include antisense or inhibitory applications, e.g., as inhibitory RNA or gene 
therapy (e.g., for incorporation into the genome) or as antisense compositions. 

Lung cancer polypeptides and polynucleotides can also be administered as vaccine 
compositions to stimulate HTL, CTL, and antibody responses. Such vaccine compositions 
can include, e.g., lipidated peptides (see, e.g., Vitiello, et al. (1995) J. Clin. Invest. 95:341), 
peptide compositions encapsulated in poly(DL-lactide-co-glycolide) ("PLG") microspheres 
(see, e.g., Eldridge, et al. (1991) Molec. Immunol. 28:287-294; Alonso, et al. (1994) Vaccine 
12:299-306; Jones, et al. (1995) Vaccine 13:675-681), peptide compositions contained in 
immune stimulating complexes (ISCOMS) (see, e.g., Takahashi, et al. (1990) Nature 
344:873-875; Hu, et al. (1998) Clin Exp Immunol. 113:235-243), multiple antigen peptide 
systems (MAPs) (see, e.g., Tam (1988) Proc. Natl. Acad. Sci. U.S.A. 85:5409-5413; Tam 
(1996) J. Immunol. Methods 196:17-32), peptides formulated as multivalent peptides; 
peptides for use in ballistic delivery systems, typically crystallized peptides, viral delivery 
vectors (Perkus, et al., p. 379 In: Kaufmann (ed. 1996) Concepts in vaccine development; 
Chakrabarti, et al. (1986) Nature 320:535; Hu, et al. (1986) Nature 320:537; Kieny, et al. 
(1986) AIDS Bio/Technology 4:790; Top, et al. (1971) J. Infect. Dis. 124:148; Chanda, et al. 
(1990) Virology 175:535), particles of viral or synthetic origin (see, e.g., Kofler, et al. (1996) 
J. Immunol. Methods 192:25; Eldridge, et al. (1993) Sem. Hematol. 30:16; Falo, et al. (1995) 
Nature Med. 7:649), adjuvants (Warren, et al. (1986) Annu. Rev. Immunol. 4:369; Gupta, et 
al. (1993) Vaccine 11:293), liposomes (Reddy, et al. (1992) J. Immunol. 148:1585; Rock 
(1996) Immunol. Today 17:131), or naked or particle absorbed cDNA (Ulmer, et al. (1993) 
Science 259:1745; Robinson, et al. (1993) Vaccine 11:957; Shiver, et al., p. 423 In: 
Kaufmann (ed. 1996) Concepts in vaccine development; Cease and Berzofsky (1994) Annu. 
Rev. Immunol. 12:923 and Eldridge, et al. (1993) Sem. Hematol. 30:16). Toxin-targeted 
delivery technologies, also known as receptor mediated targeting, such as those of Avant 
Immunotherapeutics, Inc. (Needham, Massachusetts) may also be used. 

Vaccine compositions often include adjuvants. Many adjuvants contain a substance 
designed to protect the antigen from rapid catabolism, such as aluminum hydroxide or 
mineral oil, and a stimulator of immune responses, such as lipid A, Bordetella pertussis or 
Mycobacterium tuberculosis derived proteins. Certain adjuvants are commercially available 
as, e.g., Freund's Incomplete Adjuvant and Complete Adjuvant (Difco Laboratories, Detroit, 
MI); Merck Adjuvant 65 (Merck and Company, Inc., Rahway, NJ); AS-2 (SmithKline 
Beecham, Philadelphia, PA); aluminum salts such as aluminum hydroxide gel (alum) or 
aluminum phosphate; salts of calcium, iron or zinc; an insoluble suspension of acylated 
tyrosine; acylated sugars; cationically or anionically derivatized polysaccharides; 
polyphosphazenes; biodegradable microspheres; monophosphoryl lipid A and quil A. 
Cytokines, such as GM-CSF, interleukin-2, -7, -12, and other like growth factors, may also be 
used as adjuvants. 

Vaccines can be administered as nucleic acid compositions wherein DNA or RNA 
encoding one or more of the polypeptides, or a fragment thereof, is administered to a patient. 
This approach is described, for instance, in Wolff, et al. (1990) Science 247:1465 as well as 
U.S. Patent Nos. 5,580,859; 5,589,466; 5,804,566; 5,739,118; 5,736,524; 5,679,647; WO 
98/04720; and in more detail below. Examples of DNA-based delivery technologies include 
"naked DNA", facilitated (bupivacaine, polymers, peptide-mediated) delivery, cationic lipid 
complexes, and particle-mediated ("gene gun") or pressure-mediated delivery (see, e.g., U.S. 
Patent No. 5,922,687). 

For therapeutic or prophylactic immunization purposes, the peptides of the invention 
can be expressed by viral or bacterial vectors. Examples of expression vectors include 
attenuated viral hosts, such as vaccinia or fowlpox. This approach involves the use of 
vaccinia virus, e.g., as a vector to express nucleotide sequences that encode lung cancer 
polypeptides or polypeptide fragments. Upon introduction into a host, the recombinant 
vaccinia virus expresses the immunogenic peptide, and thereby elicits an immune response. 
Vaccinia vectors and methods useful in immunization protocols are described in, e.g., U.S. 
Patent No. 4,722,848. Another vector is BCG (Bacille Calmette Guerin). BCG vectors are 
described in Stover, et al. (1991) Nature 351:456-460. A wide variety of other vectors useful 
for therapeutic administration or immunization, e.g., adeno and adeno-associated virus 
vectors, retroviral vectors, Salmonella typhi vectors, detoxified anthrax toxin vectors, and the 
like, will be apparent to those skilled in the art from the description herein (see, e.g., Shata, et 
al. (2000) Mol Med Today 6:66-71; Shedlock, et al. (2000) J. Leukoc. Biol. 68:793-806; 
Hipp, et al. (2000) In Vivo 14:571-85). 

Methods for the use of genes as DNA vaccines are well known, and include placing a
lung cancer gene or portion of a lung cancer gene under the control of a regulatable promoter
or a tissue-specific promoter for expression in a lung cancer patient. The lung cancer gene
used for DNA vaccines can encode full-length lung cancer proteins, but more preferably
encodes portions of the lung cancer proteins including peptides derived from the lung cancer
protein. In one embodiment, a patient is immunized with a DNA vaccine comprising a
plurality of nucleotide sequences derived from a lung cancer gene. For example, lung
cancer-associated genes or sequences encoding subfragments of a lung cancer protein are introduced
into expression vectors and tested for their immunogenicity in the context of Class I MHC
and an ability to generate cytotoxic T cell responses. This procedure provides for production
of cytotoxic T cell responses against cells which present antigen, including intracellular
epitopes.

In a preferred embodiment, DNA vaccines include a gene encoding an adjuvant 
molecule with the DNA vaccine. Such adjuvant molecules include cytokines that increase 
the immunogenic response to the lung cancer polypeptide encoded by the DNA vaccine. 
Additional or alternative adjuvants are available. 

In another preferred embodiment, lung cancer genes find use in generating animal
models of lung cancer. When the lung cancer gene identified is repressed or diminished in
metastatic tissue, gene therapy technology, e.g., antisense or inhibitory RNA directed
to the lung cancer gene, will also diminish or repress expression of the gene. Animal models
of lung cancer find use in screening for modulators of a lung cancer-associated sequence or
modulators of lung cancer. Similarly, transgenic animal technology including gene knockout
technology, e.g., as a result of homologous recombination with an appropriate gene targeting
vector, will result in the absence or increased expression of the lung cancer protein. When
desired, tissue-specific expression or knockout of the lung cancer protein may be necessary.

It is also possible that the lung cancer protein is overexpressed in lung cancer. As
such, transgenic animals can be generated that overexpress the lung cancer protein.
Depending on the desired expression level, promoters of various strengths can be employed
to express the transgene. Also, the number of copies of the integrated transgene can be
determined and compared for a determination of the expression level of the transgene.
Animals generated by such methods will find use as animal models of lung cancer and are
additionally useful in screening for modulators to treat lung cancer.



Kits for Use in Diagnostic and/or Prognostic Applications 

For use in diagnostic, research, and therapeutic applications suggested above, kits are
also provided by the invention. In diagnostic and research applications such kits may include
at least one of the following: assay reagents, buffers, lung cancer-specific nucleic acids or
antibodies, hybridization probes and/or primers, antisense polynucleotides, ribozymes, RNAi,
dominant negative lung cancer polypeptides or polynucleotides, small molecule inhibitors of
lung cancer-associated sequences, etc. A therapeutic product may include sterile saline or
another pharmaceutically acceptable emulsion and suspension base.

In addition, the kits may include instructional materials containing instructions (e.g.,
protocols) for the practice of the methods of this invention. While the instructional materials
typically comprise written or printed materials, they are not limited to such. A medium
capable of storing such instructions and communicating them to an end user is contemplated
by this invention. Such media include, but are not limited to, electronic storage media (e.g.,
magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the like. Such
media may include addresses to internet sites that provide such instructional materials.

The present invention also provides for kits for screening for modulators of lung
cancer-associated sequences. Such kits can be prepared from readily available materials and
reagents. For example, such kits can comprise one or more of the following materials: a lung
cancer-associated polypeptide or polynucleotide, reaction tubes, and instructions for testing
lung cancer-associated activity. Optionally, the kit contains biologically active lung cancer
protein. A wide variety of kits and components can be prepared according to the present
invention, depending upon the intended user of the kit and the particular needs of the user.
Diagnosis would typically involve evaluation of a plurality of genes or products. The genes
typically will be selected based on correlations with important parameters in disease which
may be identified in historical or outcome data.
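
By way of illustration only, the selection of genes by correlation with a disease parameter can be sketched in Python as follows. The gene names, expression values, and outcome values are hypothetical placeholders, and Pearson correlation is an assumed choice of measure; the specification does not prescribe a particular statistic.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant profile carries no information; treat as uncorrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_genes(expression, outcome, top_n):
    """Rank genes by absolute correlation with an outcome parameter.

    expression: dict mapping a gene label to per-sample expression values
    outcome:    per-sample values of the disease parameter (same sample order)
    """
    scored = [(gene, pearson(values, outcome)) for gene, values in expression.items()]
    scored.sort(key=lambda item: abs(item[1]), reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    # Hypothetical data: four samples, four genes.
    expression = {
        "gene_A": [1.0, 2.0, 3.0, 4.0],
        "gene_B": [4.0, 3.0, 2.0, 1.0],
        "gene_C": [1.0, 1.0, 1.0, 1.0],
        "gene_D": [1.0, 3.0, 2.0, 4.0],
    }
    outcome = [0.0, 1.0, 2.0, 3.0]
    for gene, r in rank_genes(expression, outcome, top_n=3):
        print(gene, round(r, 2))
```

Ranking on the absolute value keeps both positively and negatively correlated genes as candidates, which matches the use of both up-regulated and down-regulated sequences elsewhere in the description.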
Example 1: Gene Chip Analysis

Molecular profiles of various normal and cancerous tissues were determined and
analyzed using gene chips. RNA was isolated and gene chip analysis was performed as
described (Glynne, et al. (2000) Nature 403:672-676; Zhao, et al. (2000) Genes Dev. 14:981-993).

Tables 1A and 1B were previously filed on April 18, 2001 in USSN 60/284,770 (18501-001500US) and on November 29, 2001 in USSN 60/334,370
Hs.141883  ESTs  0.75  0.27
104857  AA043219  Hs.19058  ESTs  2.6  3.3
104865  AA045136  Hs.22575  ESTs  1.23  0.49
104989  AA102098  Hs.118615  ESTs  0.63  0.32
105729  AA292694  Hs.3807  ESTs; Weakly similar to PHOSPHOLEMMAN PR  0.86  0.34
105847  AA398608  Hs.32241  ESTs  1.32  0.4
105894  AA400979  Hs.25691  calcitonin receptor-like receptor activi  0.78  0.28
106490  AA451861  Hs.115537  ESTs; Weakly similar to dipeptidase prec  1.2  0.47
106536  AA453997  Hs.23804  ESTs  0.82  0.15
106605  AA457718  Hs.21103  Homo sapiens mRNA; cDNA DKFZp564B076 (fr  0.99  0.07
106667  AA461086  Hs.16578  ESTs  1.17  0.4
106773  AA478109  Hs.188833  ESTs  1.46  0.43
106797  AA478962  Hs.169943  ESTs  1.18  0.32
106844  AA485055  Hs.156213  sperm associated antigen 6  0.98  0.51
106870  AA487576  Hs.26530  serum deprivation response (phosphatidyl  1.05  0.14
106954  AA496980  Hs.204038  ESTs  1.25  0.33
107054  AA600150  Hs.14366  ESTs  1.11  0.4
107292  T30407  Hs.4789  ESTs; Weakly similar to oxidative-stress  1.07  2.58
107994  AA036811  Hs.165030  ESTs  0.7  0.21
107997  AA037388  Hs.82223  Human DNA sequence from clone 141H5 on c  1.02  0.48
108041  AA041552  Hs.61957  ESTs  1.44  0.51
108087  AA045709  Hs.40545  ESTs  1.98  1
108362  AA074885  Hs.67726  macrophage receptor with collagenous str  1.52  0.72
108435  AA078787  Hs.194101  ESTs  2.53  1.53
108480  AA081093  Hs.68055  ESTs  1.56  0.48
109252  AA194830  Hs.85944  ESTs  2.69  3.18
109550  F01534  Hs.26981  ESTs  1.19  0.65
109613  F03031  Hs.27519  ESTs  1.01  0.29
109837  H00656  Hs.29792  ESTs  0.81  0.15
109893  K04768  Hs.30484  ESTs  1.44  0.32
109984  K09594  Hs.10299  ESTs  0.62  0.14
110099  H16568  Hs.23748  ESTs  1.01  0.28
110837  N30796  Hs.17424  ESTs; Weakly similar to semaphorin F [H.  1.1  0.22
111247  N69825  Hs.16762  Homo sapiens mRNA; cDNA DKFZp564B2062 (f  1.26  0.26
111341  N80935  Hs.22483  ESTs  1.57  0.52
111510  R07856  Hs.16355  ESTs  3.96  1
111737  R25410  Hs.9218  ESTs  0.97  0.24
113195  T57112  yc20g11.s1 Stratagene lung (#937210)  1.22  0.35
113238  T62979  Hs.189813  ESTs  2.27  0.45
113540  T90496  Hs.16757  ESTs  1.06  0.22
113552  T90889  Hs.16026  ESTs  1.16  0.42
113606  T93093  Hs.17125  ESTs  1.48  0.7
113695  T96965  Hs.17948  ESTs  1.54  0.28
113946  W84753  Hs.37896  ESTs  1.79  0.72
114251  Z39898  Hs.21948  ESTs  1.95  0.25
114359  Z41589  Hs.153483  ESTs; Moderately similar to H1 chloride  1.42  0.13
115230  AA278300  Hs.182980  ESTs  2.62  0.42
115279  AA279760  Hs.63671  ESTs  1.79  0.91
115566  AA398083  Hs.43977  ESTs  0.86  0.2
115965  AA446661  Hs.173233  ESTs  0.79  0.04
116166  AA461556  Hs.202949  KIAA1102 protein  2.29  0.68
116279  AA486073  Hs.57362  ESTs  2.27  0.78
117023  H88157  Hs.41105  ESTs  1.36  0.16
117209  H99959  Hs.42788  ESTs  1.46  0.48
118901  N90719  Hs.94445  ESTs  1.51  1
118981  N93839  Hs.39238  ESTs  1.34  0.48
119073  R32894  Hs.45514  v-ets avian erythroblastosis virus E26 o  1.14  0.27
119221  R98105  yr30g11.s1 Soares fetal liver spleen  1.32  0.53
119824  W74535  Hs.184  advanced glycosylation end product-speci  1  0.19
119861  W80715  ESTs; Moderately similar to !!!! ALU SUB  1.83  0.45
120041  W92775  Hs.59358  ESTs  1.23  0.55
120132  Z38839  Hs.125019  ESTs; Highly similar to KIAA0885 protein  0.91  0.37
120467  AA251579  Hs.187628  ESTs  1.87  1.91
121314  AA402799  Hs.182538  ESTs  1.3  0.31
121643  AA417078  Hs.193767  ESTs  2.31  0.68
121690  AA418074  Hs.110286  ESTs  1.47  0.51
122633  AA454080  Hs.34853  inhibitor of DNA binding 4; dominant neg  1.31  0.63
123978  C20853  Hs.170278  ESTs  1.52  0.32
124214  H58608  Hs.151323  ESTs  0.93  0.35
124357  N22401  yw37g07.s1 Morton Fetal Cochlea Homo  1.29  1
124438  N40188  Hs.102550  ESTs  1.36  0.7
125167  W45560  Hs.102541  ESTs  1.46  0.69
125174  W51835  Hs.231082  EST  3.07  3.76
125422  AA903229  Hs.153717  ESTs  1.34  0.3
125561  AI417667  Hs.22978  ESTs  1.89  0.63
125831  D60988  HUM145B09B Clontech human fetal brain  0.94  0.36
127002  R35380  Hs.24979  ESTs  3.02  4.06
127307  AA369367  Hs.126712  ESTs; Weakly similar to pIL2 hypothetic  1.01  0.69
127609  AA622559  Hs.150318  ESTs  1.21  0.32
127959  AI302471  Hs.124292  ESTs  2.5  1
128458  D52193  Hs.55340  ESTs  1.13  0.33
128624  AA479209  Hs.102647  ESTs  1.45  0.58
128789  AA486567  Hs.105695  ESTs  1.1  0.34
128798  AF014958  Hs.105938  chemokine (C-C motif) receptor-like 2  1.16  0.55
128952  R51076  Hs.107361  ESTs; Highly similar to Rap2 interacting  2.04  2.4
129057  X62466  Hs.214742  CDW52 antigen (CAMPATH-1 antigen)  1.77  0.73
129210  AA401654  Hs.202949  KIAA1102 protein  1.11  0.36
129240  W24360  Hs.237868  Interleukin 7 receptor  0.91  0.41
129402  T637B1  yc21g0U1 Stratagene lung (#937210)  1.36  0.43
129565  X77777  Hs.198726  vasoactive intestinal peptide receptor 1  0.67  0.08
129593  AA487015  Hs.98314  Homo sapiens mRNA; cDNA DKFZp566L0120 (f  1.3  0.42
129626  AA447410  Hs.11712  ESTs; Weakly similar to !!!! ALU SUBFAMI  1.28  0.46
129699  AA458576  Hs.12017  KIAA0439 protein; homolog of yeast ubiqu  1.58  1
129898  N48595  Hs.13256  ESTs  1.13  0.53
129958  L20591  Hs.1378  annexin A3  0.81  0.31
130273  U59914  Hs.153863  MAD (mothers against decapentaplegic; Dr  0.59  0.22
130655  N92934  Hs.17409  cysteine-rich protein 1 (intestinal)  1.44  0.76
130657  T94452  Hs.201591  ESTs  0.96  0.42
131061  N64328  Hs.22567  ESTs; Moderately similar to HYPOTHETICAL  1.51  0.45
131066  F09006  Hs.22588  ESTs  0.97  0.37
131263  R38334  Hs.24950  regulator of G-protein signalling 5  2.34  2.82
131589  U52100  Hs.29191  epithelial membrane protein 2  1.2  0.62
131686  AA157428  Hs.30687  Grb2-associated binder 2  0.95  0.38
131751  H18335  Hs.31562  ESTs  1.47  0.52
132430  T23630  Hs.258675  EST  1.86  2.09
132476  N67192  Hs.49476  Homo sapiens clone TUA8 Cri-du-chat regi  1.73  0.58
132836  F09557  Hs.57929  slit (Drosophila) homolog 3  0.91  0.29
133120  X64559  Hs.65424  tetranectin (plasminogen-binding protein  0.82  0.2
133488  D45370  Hs.74120  adipose specific 2  1.29  0.48
133565  H57056  Hs.204831  ESTs  2.25  0.57
133651  U97105  Hs.173381  dihydropyrimidinase-like 2  1.65  0.62
133835  AA059489  Hs.76640  ESTs; Highly similar to RGC32 [Rnorveg  1.16  0.34
133978  W73859  Hs.780B1  transcription factor 21  0.79  0.27
133985  L34657  Hs.78146  platelet/endothelial cell adhesion molec  0.99  0.28
134299  AA487558  Hs.8135  ESTs  1.02  0.46
134300  U81984  Hs.166082  endothelial PAS domain protein 1  0.86  0.42
134323  AA028976  Hs.8175  Homo sapiens mRNA; cDNA DKFZp564M0763 (f  1.19  0.27
134343  D50883  Hs.82028  transforming growth factor, beta recepto  1.21  0.67
134417  D87969  Hs.82921  solute carrier family 35 (CMP-sialic aci  1.28  1
134561  U76421  Hs.85302  adenosine deaminase; RNA-specific; B1 (h  2.12  0.55
134624  W67147  Hs.8700  deleted in liver cancer 1  2.35  2.74
134696  H88354  Hs.8861  ESTs  1.35  0.33
134749  L10955  Hs.89485  carbonic anhydrase IV  0.89  0.2
134786  L06139  Hs.89640  TEK tyrosine kinase; endothelial (venous  0.48  0.21
134869  T3528B  Hs.90421  ESTs; Moderately similar to !!!! ALU SUB  2.14  2.64
135346  M21056  Hs.992  phospholipase A2; group IB (pancreas)  0.63  0.13
100113  D00591  Hs.84746  Chromosome condensation 1  1  2.15
100147  D13666  Hs.136348  Homo sapiens mRNA for osteoblast specifi  0.5  2
100280  D42085  Hs.155314  KIAA0095 gene product  1.02  1.39
100335  D63391  Hs.6793  platelet-activating factor acetylhydrola  1  5.58
100360  D78335  Hs.75939  Uridine monophosphate kinase  0.91  2.04
100372  D79997  Hs.184339  KIAA0175 gene product  0.75  2.03


100486  HG1112-HT1112  TIGR: ras-like protein TC4  1.09  1.93
100559  HG2197-HT2267  collagen, type VII, alpha 1  0.97  3.6
lUUo/o  HG22904TT2386  caiciiCTun/aipna-wjKr, «l iranscnpi
100668  HG2981-HT3938  TIGR: CD44 (epican, alt. transcript 12  0.85  1.9
100906  HG471&-KT5158  Guanosine 5'-Monophosphate Synthase  1.18  2.29
100930  HG721-HT4827  TIGR: placental protein 14, endometrial  1  1.45
100950  J00124  Hs.1177»  keratin 14 (epidermolysis bullosa simple  0.84
101031  J05070  Hs.151738  Matrix metalloproteinase 9 (gelatinase  0.77
101111  L08424  Hs.1619  Achaete-scute complex (Drosophila) homol  1
101124  L10343  Hs.112341  Protease inhibitor 3, skin-derived (SKA  0.62
101175  L18920  Hs.36980  Melanoma antigen, family A, 2  1
101204  L24203  Hs.82237  Ataxia-telangiectasia group D-associated  0.74
101431  M19888  Hs.1076  Small proline-rich protein 1B (cornifin)  0.85
101448  M21389  Hs.195850  keratin 5 (epidermolysis bullosa simplex  0.61
101511  M27826  Hs.267319  Endogenous retroviral protease  1.03
101526  M29540  Hs.220529  Carcinoembryonic antigen-related cell ad  1.07
101548  M31328  Hs.71642  Guanine nucleotide binding protein (G p  0.97
101625  M57293  Human parathyroid hormone-related pepti  1
101649  M60047  Hs.1690  Heparin-binding growth factor binding pr  1
101724  M69225  Hs.620  bullous pemphigoid antigen 1 (230/240kD)  1
101748  M76482  Hs.1925  Desmoglein 3 (pemphigus vulgaris antigen  1
101759  M80244  Hs.184601  Solute carrier family 7 (cationic amino  1.07
101804  M86699  Hs.169840  TTK protein kinase  1
101606  M86757  Hs.112408  S100 calcium-binding protein A7 (psorias  0.74
101809  M86849  Homo sapiens connexin 26 (GJB2) mRNA, c  1
101845  M93426  Hs.78867  Protein tyrosine phosphatase, receptor-  1
101851  M94250  Hs.82045  Midkine (neurite growth-promoting factor  1.13
102083  U10323  Hs.75117  Interleukin enhancer binding factor %  1.03
102154  umso  Hs.75517  laminin, beta 3 (nicein (125kD), kalinin  0.94
102193  U20758  Hs.313  secreted phosphoprotein 1 (osteopontin;  0.34
102305  U33286  Hs.90073  chromosome segregation 1 (yeast homolog)  1.45
102348  U37519  Hs.87539  Aldehyde dehydrogenase 8  0.52
102581  U61145  Hs.77256  Enhancer of zeste (Drosophila) homolog 2  0.91
102610  U65011  Hs.30743  Preferentially expressed antigen in mela  1
102623  U66083  Hs.37110  Melanoma antigen, family A, 9 (MAGE-9)  1
102669  U71207  Hs.29279  Eyes absent (Drosophila) homolog 2  1
102696  U74612  Hs.239  Forkhead box M1  1.06
102829  U91618  Hs.80962  Neurotensin  1
102888  X04741  Hs.76118  Ubiquitin carboxyl-terminal esterase L1  1.13
102913  X07696  Hs.80342  keratin 15  0.7
102915  X07820  Hs.2258  Matrix Metalloproteinase 10 (Stromelysin  1.15
102963  X15943  Hs.37058  Calcitonin/calcitonin-related polypepti  1
103021  X53587  Hs.85268  Integrin, beta 4  1.38
103036  X54925  Hs.83169  Matrix metalloprotease 1 (interstitial c  1
103058  X57348  Hs.184510  Stratifin  1.25
103060  X57766  Hs.155324  matrix metalloproteinase 11 (stromelysin  1
103119  X63629  Hs.2877  Cadherin 3, P-cadherin (placental)  1.16
103206  X72755  Hs.77367  monokine induced by gamma interferon  0.71
103242  X76342  Hs.389  Alcohol dehydrogenase 7 (class IV), mu  1
103312  X82693  Hs.3185  Lymphocyte antigen 6 complex, locus D;  0.92
103478  Y07755  Hs.38991  S100 calcium-binding protein A2  1.05
103558  Z19574  Hs.2785  keratin 17  0.65
103576  Z26317  Hs.2631  Desmoglein 2  0.79
103587  Z29083  Hs.82128  5T4 Oncofetal antigen  1
103594  Z31560  Hs.816  SRY (sex determining region Y)-box 2, p  0.71
103768  AA089997  ESTs, Highly similar to integral membra  0.99
104158  AA454908  Hs.8127  KIAA0144 gene product  0.96
104558  R5S678  Hs.88959  Human DNA sequence from clone 967N21 on  1.23
104689  AA010665  ESTs  0.96
104733  AA019498  Hs.23071  ESTs  1.18
104906  AA055809  Hs.26802  Protein kinase domains containing protei  1.11
104978  AA088458  Hs.19322  ESTs; Weakly similar to !!!! ALU SUBFAMI  1.64
105012  AA116036  Hs.9329  Homo sapiens mRNA for fls353, complete  1.19
105175  AA186804  Hs.25740  ESTs; Weakly similar to unknown [S.cerev  0.9
105263  AA227926  Hs.6682  ESTs  0.95
105298  AA233459  Hs.26369  ESTs  1
105312  AA233854  Hs.23348  S-phase kinase-associated protein 2 (p45  1.32
105719  AA291644  Hs.36793  Hypothetical protein FLJ23188  1.28
105743  AA293300  Hs.9598  ESTs  1
106012  AA411621  Hs.8895  ESTs; sameasBFH6?  0.94
106231  AA429571  Hs.38002  KIAA1355 protein  1.04
106540  AA454607  Hs.38114  Hypothetical protein FLJ11100  1.26
106575  AA456039  Hs.105421  ESTs  1
106632  AA459897  Hs.11950  GPI-anchored metastasis-associated prote  0.67
106727  AA465342  Hs.34045  Hypothetical protein FLJ20764  0.67
106906  AA490237  Hs.222024  Transcription factor BMAL2 (cycle-like f  0.61
107059  AA608545  Hs.23044  RAD51 (S. cerevisiae) homolog (E coli Re  0.48
107104  AA609786  Hs.15243  Nucleolar protein 1 (120kD)  1.01
107151  AA621169  Hs.8687  ESTs; procollagen I-N proteinase  0.97
107284  S74039  Hs.291904  Accessory proteins BAP31/BAP29  1.15
107901  AA026418  Hs.91539  ESTs  0.72
107922  AA028028  Hs.61460  Ig superfamily receptor LNIR precursor  1
107932  AA029317  Hs.18878  Hypothetical protein FLJ21620  1
108695  AA121315  Hs.70823  KIAA1077 protein  0.91
108857  AA133250  Hs.62180  ESTs  1
108860  AA133334  Hs.129911  ESTs  0.73
108990  AA152296  Hs.72045  ESTs  1
109166  AA179845  Hs.73625  RAB6 interacting, kinesin-like (rabkine  1
109424  AA227919  Hs.85962  Hyaluronan synthase 3  1
109665  F05012  Hs.27027  Hypothetical protein DKFZp762H1311  1.42
109970  H09281  Hs.13234  ESTs  1.13
1.88

3.15
4.63

287 

1.13 

3.01 

231 

1 
132771  AA488432  Hs.56407  phosphoserine phosphatase
132833  U78525  Hs.57783  eukaryotic translation initiation factor
132922  T23641  Hs.6066  KIAA1112 protein
132959  AA028103  Hs.61472  ESTs; Weakly similar to unknown [S.cerev
132994  AA505133  Hs.7594  solute carrier family 2 (facilitated glu
133005  C21400  Hs.103329  KIAA0970 protein
133065  X62535  Hs.172690  diacylglycerol kinase; alpha (80kD)
133083  N70633  Hs.6456  chaperonin containing TCP1; subunit 2 (b
133086  L17131  Hs.139800  high-mobility group (nonhistone chromoso
133134  TB9703  Hs.65648  RNA binding motif protein 8
133195  AA350744  Hs.181409  KIAA1007 protein
133313  AA249427  Hs.70704  ESTs
133331  T62039  Hs.15B675  ribosomal protein L14
133438  D13370  Hs.73722  APEX nuclease (multifunctional DNA repai
133445  T99303  Hs.73797  guanine nucleotide binding protein (G pr
133483  X52426  Hs.74070  keratin 13
133492  L40397  Hs.74137  transmembrane trafficking protein
133504  W95070  Hs.74316  desmoplakin (DPI; DPII)
133517  X52947  Hs.74471  gap junction protein; alpha 1; 43kD (con
133540  D78151  Hs.74619  proteasome (prosome; macropain) 26S subu
133594  L07758  Hs.172589  nuclear phosphoprotein similar to S. cer
133627  U09587  Hs.75280  glycyl-tRNA synthetase
133671  T25747  Hs.75471  zinc finger protein 146
133859  U86782  Hs.178761  26S proteasome-associated pad1 homolog
133865  F09315  Hs.170290  discs; large (Drosophila) homolog 5
133913  W84712  Hs.7753  calumenin
133963  L34587  Hs.184693  transcription elongation factor B (SIII)
133982  U47621  Hs.207251  nucleolar autoantigen (55kD) similar to
134100  L07540  Hs.171075  replication factor C (activator 1) 5 (36
134110  U41060  Hs.79136  LIV-1 protein; estrogen regulated
134158  U15174  Hs.79428  BCL2/adenovirus E1B 19kD-interacting pro
134161  U97188  Hs.79440  IGF-II mRNA-binding protein 3
134193  F09570  Hs.7980  ESTs
134367  X54199  Hs.82285  phosphoribosylglycinamide formyltransfer
134402  U25165  Hs.82712  fragile X mental retardation; autosomal
134457  D86963  Hs.174044  dishevelled 3 (homologous to Drosophila
134469  X17567  Hs.83753  small nuclear ribonucleoprotein polypept
134498  M63180  Hs.84131  threonyl-tRNA synthetase
134501  W84870  Hs.211568  eukaryotic translation initiation factor
134507  M63488  Hs.84318  replication protein A1 (70kD)
134548  U41515  Hs.85215  Deleted in split-hand/split-foot 1 regio
134599  X99226  Hs.86297  Fanconi anemia; complementation group A
134692  R73567  Hs.8850  a disintegrin and metalloproteinase doma
134693  N70361  Hs.8854  ESTs
134806  Z49099  Hs.89718  spermine synthase
134821  Z34974  Hs.198382  plakophilin 1 (ectodermal dysplasia/skin
134864  Y08999  Hs.90370  actin related protein 2/3 complex; subun
134914  U29615  Hs.91093  chitinase 1 (chitotriosidase)
134953  L10678  Hs.91747  profilin 2
134993  AA282343  Hs.9242  purine-rich element binding protein B
135051  C15324  Hs.93668  ESTs
135158  U51711  Human desmocollin-2 mRNA; 3' UTR
Table 1B shows the accession numbers for those pkeys in Table 1A lacking UnigeneID's. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
Accession column.

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers
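
For readers working with these data electronically, the three-column layout described above can be sketched in Python as follows. The whitespace-delimited record format and the identifiers `P0001`, `C0001`, `ACC001`, etc. are hypothetical placeholders chosen for illustration; they are not values from the table.

```python
from typing import Dict, List, NamedTuple, Tuple


class ClusterRecord(NamedTuple):
    pkey: str                    # probeset identifier
    cat: str                     # gene cluster number
    accessions: Tuple[str, ...]  # accession numbers comprising the cluster


def parse_record(line: str) -> ClusterRecord:
    """Split one whitespace-delimited record into Pkey, CAT number, accessions."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("expected a pkey, a CAT number and at least one accession")
    return ClusterRecord(fields[0], fields[1], tuple(fields[2:]))


def index_by_accession(records: List[ClusterRecord]) -> Dict[str, List[str]]:
    """Map each accession number to the pkeys whose clusters contain it."""
    index: Dict[str, List[str]] = {}
    for record in records:
        for accession in record.accessions:
            index.setdefault(accession, []).append(record.pkey)
    return index


if __name__ == "__main__":
    # Hypothetical records; two probesets share the accession ACC002.
    lines = [
        "P0001 C0001 ACC001 ACC002",
        "P0002 C0002 ACC002 ACC003",
    ]
    records = [parse_record(line) for line in lines]
    print(index_by_accession(records)["ACC002"])
```

Indexing by accession makes it easy to see when two probesets were designed from overlapping clusters, as appears to be the case for adjacent pkeys that share a CAT number.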



Pkey 

100661 
100667 



CAT 

23182J 
26401J 



100668 26401J 



101332 25130J 



BE623001 L05096 AA383604 AW966416 N53295 M460213 AW571519 AA603655 

105424 X56794 S66400 X55150 W60071 AW351820 X55938 M83326 BE005289 BE070059 M83324 BE005248 BE069717 BE181648 BE069700 
AW606203 BE069721 AW382138 AW803776 BE463954 BE005334 8E005274 T27386 AA932714 AA972695 AW377728 A1632506 T29066 
AI783934 AW377727 BE163715 AL047291 AA279047 AA523003 BE008O48 BE440141 W23614 BE090519 BE092193 N29181 N20358 N44153 
BE546944 T69231 AW377441 AA907406 H50799 AW051416 AI420712 BE620922 A1279161 AA992549 W47198 BE005241 AI342696 H50700 
AI969974 AI863855 AA374490 AW130675AI950633 AA146687 H99482 X55150 BE005414 BE005339 N28294 A167306B AI887890 AW804171 
AI675961 AW804172AA778841 AL048050 A1127757 A1095568 AW204965 AW468978 W31B98 AI052595 AI278771 BE464018 AI081503AI824196 
AA513211 AM110S2 AVW84376 N48752 M7(^ 

AA283090 AA952536 H82726 W521 1 5 W45432 W60433 AA577548 AA 1467 14 BE1 50994 AA05461 5 AW796025 AW382768 BE565671 C00444 
AA054555 

L05424 X56794 S66400 X55150 W60071 AW351820 X55938 MB3326 BE005289 BEO70O59 M83324 BE005248 BE069717 BE181648 BE069700 
AW606203 BE069721 AW3821 38 AW803776 BE463954 BE005334 BE005274 T27388 AA93271 4 AA972695 AW377728 AI632506 T29066 
A1783934 AW377727 BE163715 AL047291 AA279047 AA523003 BE008048 BE440141 W23814 BE090519 BE092193 N29181 N20358 N44153 
BE546944 T69231 AW377441 AA907406 H50799 AW051416 AI420712 BE620922 AI279161 AA992549 W47198 BE005241 AI342696 H50700 
AI969974 AI863855 AA374490 AW130675 AI950533 AA146687 H994B2 X55150 BE005414 BE005339 N28294 AI673068 A1887890 AW804171 
AI675961 AW804172AA776841 AL048050At127757AK)95568 AW204965 AW468978 W31898AI052595Ai278771 BE464018 AI081503AJ824196 
AA51 321 1 AA41 1 062 AW084376 N48752 AA703209 N355B0 AW05991 8 AA054563 AI280942 T27619 BE621 435 N6601 0 AW589527 AI160414 
AA283090 AA962536 H82726 W52115 W45432 W60433 AA577548 AA146714 BE150994 AA054615 AW796025 AW382768 BE565671 C00444 
AA054555 

JO4088 NM_001067 AF071747 AJ01 1741N85424 AL042407 AA218572BE296748BE0^1 AL(M0877AW499918AVV575045H17813 
BE081283 AA670403 AW504327 BE094229 AA 104024 AI471482 AI970337 AA737616 AI827444 AW003286 AI742333 Al 344044 AI765534 






WO 02/086443 PCT/US02/12476 
AI948838 AW235336 AW172827 AA095289 BE046383AT734240 W16699 AI66Q329 AI289433 AA933778 AW469242 AA468838 AA806983 
AA625873 W78031 BE206307 AA550803 AI743147 AI990075 AA948274 AA1 29533 A! 535399 AAB05313 AI624669 AW594319 A1221834 AI337434 
AA307708 BE550282 AT760457 AI630S36 AI221521 AW674314AW078889AI933732AI686969 At 186928 AW074595 All 27485 AL079544 
AJ910815 H17814 AA310903 AW137854 T19279 AA026682AA306035 AW383390 AW383389 AW3B3422 AW383427 AW383395 H09977 
AA306247 AA352501 AW403639 F05421 AA224473 AA305321 H93904 AA089612AW391543AW402915AW173382AW402701 AW403113 
R94438 N731 26 H93468 AA090928 M095051 T29025 AW951071 U7277 L47276 AI375913 BE384156 W24652 AA746288 AA568223 BE090591 
H93033 N57027 AA504348 AA327653 AW959913 N53767 AA843715 AI453437 AW263710 AI076594 AA583483 AW873194 AW5751 66 All 28799 
AI803319 AL042776 AW074313 AI887722 AI032284 AA447521 AM 23885 N29334 AI35491 1 AW090687 AA236763 AA435535 AA235910 
AA047124 AA236734 AW514610 H93467 AA9S2007 AI446783 AA127259 AI613495 AJ686720 AI587374 AA936731 AA702453 AI859757 
AA216786 AI251 81 9 AI469227 AA805022 AI092324 N71868 AA968782 AA236919 AA809450 AA227220 AA765284 AI192007 AA76881 0 
AA805794 AA729280 AA806238 AW768817 N71B79 AI050686 AA505822 AA66B974 AI688160 BE045915 AW466315 AA731314 AA649568 
AA834316 AW591901 AW063876 AW294770 AI300266 AI336094 AI560380 AA721755 H09978 D20305 D29155 AW821790 BE150864 F01675 
AI457474 AW466316 AA550969 AA630788 

100780 458 127 BE561958 BE561728 BE397612 BE514391 BE269037 BE514207 8E562381 BE514256 BE514403 BE514250 BE397832 BE269598 BE559865 
BE396881 BE560031 BE514199 BE560037 BE560454 

100830 40Q2J AC004770 W05005 AA356068 AA094281 H29358 T56781 AW875313 L37374 BE312466 BE311755 BE207106 BE293320 BE0181 15 AW239090 
BE548830 AW247547 AA776062 BE397382 AA486713 T101 1 1 T09340AW498981 BE547280AA356003AW581520AW875331 AA580720 
AW875336 BE276873 BE408229 AW188148 BE255166 BE253761 AW793727 AW373141 AW581548 AA471223 AA30595O BE263976 AA626820 
BE257409 AW360962 AA090655 C00312 BE312741 BE407213 AA209352 AW298199 AW248553 AW297794 AW731722 BE300586 AW731972 
AW615446 BE301599AW615520 M486714AW440257M196516M564530AA618079AW192592AW474985A^ 
AI680394 AL1 35548 AI683224 AI581 1 26 AW245096 AW1941 54 H29274 N70363 AA629758 AA580602 AA862006 AI883841 AI097667 AJ928583 
A1358774 BE243487 AA620553 AA653297 AA292690 T101 1 0 238906 AA908544 AA340930 AJ1 85438 T03328 T28844 A168701 0 AI864965 
AI872575 BE388740 T56780 AW373138 BE258717 AA699671 

100906 4312L1 AU076916BE298110AW239395AW672700NM.003875U10860AW651755BE297958C03806AI795876 M644165T3503 

AA446421 AW881856 AI469428 BE548103 T96204 R94457 N78225 A1564549 AW004984 AW780423 AW675448 AW087B90 AA971454 AA305698 
AA879433 AA535069 AI394371 AA928053 AI378367 N59764 A1364000 AI431285 T81090 AW674657 AW674987 AA897396 AW673412 BE063175 
AW674408 AI202011 R00723 AI753769 AI460161 AW079585AW275744AI873729 D25791 BE537646T81139 R00722 

100930 16865 1 J04129 N^002571 AA293088AA477016AA4O4631 T28299 AA476904AA433965AA430486AA495907A1 151 391 AA29 1495 AA402723 W25651 
AA706816 AI826712 AW296294 AA293479 AI276581 AW044154 AI0801B0 AI417985 AI274168 AI474212 AA495908 AA635664 A10921 14 
AI804952 AA479874 AI597661 AI420511 AA479738 AA421417 AA421247 AA436220 AL047797 M34046 N42277 AA228076 W02698 AI420297 
AA434011 AI369971 AA479731 AI865541 AI418020 AA421246 AA462764 AL048051 

102221 3861 1 NMJ05769 U24576AW161961 AW160473AW160465AW160472AW161069 AI824831 AW1 62635 AI990356 AW1 62477 AW1 62571 AI520836 
AW162352 AW162351 AW162752 AI962216 A1537346 AA853902 H17667 BE045346 BE559802 BE255391 AA985217 AA235051 AI129757 
AW366451 T34489 D56106 D56351 AI936579 AW02321 9 AW889335 AW889120 AW889232 AW8891 75 BE093702 AW889349 AA147546 
AI952998 AA91 2579 Al 143356 AW902211 R6471 7 AW1 57236 AI81 5242 D45274 AW263991 AA442920 AA1 29965 AL035713 AI923255 A1949082 
AI142B26 A1684160 A1701987 A1678954 AI827349 BE463635 AW628092 AW302281 AA493203 BE348856 BE536419 AW193969 AW673561 
AW592609 AI224044 H43943 AA091912 R49632 R48353 AI568409 R48256 A1198046 H27986 H43899 AI678759 A1680310 A1624220 H17052 
AA156410 N56062 AI699430 AA664529 T09406 T10459 AA627506 AI379584 N83831 N88633 AW022651 AA971281 AA248036 AI039197 
AI914689 AA973825 AL047305 AA129966 AI798369 AW264348 AI445879 AI658759 N67924 AI933507 AI216121 AI333174 T10972 AI375028 
AI186756 AI273778 AA610487 AI797946 AA853903 AA903939 AI338587 AI278494 AW627595 AA904019 

1 01809 32963 1 M86849 AA315280 NMJJ04004 AA315269 BE142653 AA461400 AW802042 BE1 52893 AW3831 55 AA490688 AW1 17930 AW384563 AW384544 
AW384566 AW378307 AW378323 AW839085 AA257102 AW378317 AW276060 AW271245 AW378298 AW384497 AI5981 14 AW264544 AI018136 
AW021810 AA961504 AW086214 AW771489 AW192483 AI290266 AW192488 AW384490 AW007451 AW890895 AA554460 AA613715 
AW020066 AI783695 A1589498 A1917637 AW264471 AW3B4491 AI816732 AW368530 AW368521 AW368463 AA461087 AI341438 AI970613 
A1040737 AI41 8400 AA947181 AA962716 AI280695 AW769275 AW023591 AI160977 AA055400 N71 882 AA490466 AW243772 AW316636 
AI076554 AW51 1702 N69323 H88912 AA257017 AI952506 H88913 AJ912481 AA600714 BE465701 N64149 C00523 N64240 AA677120 

102590 15932.1 R61573 BE005029 X98091 AA297307 BE537267 BE566138 BE566139 F11561 BE564795 BE568776 AW054005 BE566479 BE380035 BE567012 
BE568634 BE565568 AA298060 BE566043 BE568813 BE568618 AA283070 BE565414 BE556738 BE568585 BE565667 BE5661 16 BE566433 
U62136 AF049140 BE567057 BE567297 BE567403 BE564316 BE567400 BE568854 BE566583 AA448772 AA071363 AW732642 BE564996 
AA297763 AA278550AA421083 AA298184AA091007 AA984577 AA205916N28759 AL031291 C15757 C15761 H02728 BE566410 AA1 29335 
AA419499 N87741 BE379689 BE004824 BE37961 1 D25874 AA148454 AA323654 AW95031 1 AA448795 AW749423 AA773386 AA773843 
AW020327 BE348580 BE504258 BE549990 BE220200 AI673334 AI202679 AA975515 D61421 AI168688 AA102843 AW246621 AJ276203 
AI074054 A1633824 AI962927 AI148926 N50969 AI30891 1 AA410994 AW373025 AA148455 H02620 AA688293 AI246318 N22220 AI917777 
AI050943 AI097286 AA663794 AW368662 AW627826 AW078734 AI253060 AA7491 54 AA832236 AI1 92358 AW024676 AA448676 AA764891 

• .BE439467 M661534AA258061AI090M6M995157A1051011AA584421AI026032AW591338AW58M^ 

F09219 BE464500 AI383595 AA954244 AA601583 AA737304 AA195549 AA805778 AI055876 AA164942 AW013961 AI672608 AW51421 1 D59441 
AW582574 AA160935 BE566501 BE564612 BE565353 BE568195 BE555447 8E568302 BE566097 BE565470 BE564249 AL036217 AW749424 
BE567494 AA1 02842 AA31 4761 AV661237 C14211 AA651866 AW798997 AA470605 
101977 29073 1 AF1 1221 3 AL050318 T24804 AW248136 BE386341 BE263177 W16677BE250224 BE563669 BE267405 BE546577AV651354 AV651292 

AI346903 A1539128 AI1 89171 S83364 AW073849 AI816760 AW073309 AI422690 AA296692 AI860301 AI805446 N77735 AI340328 BE092530 
AW02B742 BE088442 AA657742 AA742438 AW170086 A1038920 AI432379 N35073 AI936194 AA868655 AA983512 A1077505 BE080433 
A1375014 AI126547 AI348244 AI346077 AI748952 N26915 AI753574 AI093341 AI278762 BE092517 N74204 H0S158 T581 49 AJ 129303 N58365 
AA524456 BE122661 AA542925 AI246120 AI735203AA706829AA877544A1082289 AA926687 N 92840 AW249798 AA934763 AW998363 

• AI128632 N25202 AI240209 AW1 1 8892 NB0744 R35655 AI342321 AI340141 AW878792 AI857321 H0961 0 W04601 AW006550 AA1 26006 
AA553675 AI052791 AW059835AI041 906 AA81 4658 AW002059AA729483AI609301 AA994633AA903651 AI459183 T95072AW088630 
AA126112 AI800091 A1561215 H17502AW475072 AI819003AI683272 AI262701 AW793140T81787 R99588 AI2751 60 Af 31 0420 A1698929 
AA1591 74 AI827968 F303C5 F30309 AAB06662 AI091923 AW878722 AA583430 AW571 913 AI674584 AA292533 AI079471 AA642325 AA71 9050 
AW793172 AA305476 AW103745 T23459 N79525 AI784438 AA534551 AW193751 AI074360 BE281214 T32229 W25066 W01205 T63086 

• AW795348 AI361287 AW795353 AW795349 AA594759 AI400295 D1 1489 A1370689 AA482356 AA485295 W401 51 AA564661 AW300745 
AI346938 AI374975 AI423782 AW193899 AA612604 AI183409 AA996156 AW366963 AW366977 AJ284860 AA846503 AI985064 AAB44576 
AA737921 AA873274 BE241546 BE241540 AA484058 AW468970 AA127876 AA159120 AW001568 AW795213 AW79525B AW795330 BE250589 
BE387572 AA910895 AA1 61 217 BE250380 W31 500 T951 67 AI719306 AI359224 
102781 20812.1 BE258778 BE281230 BE410044 T33723 AW672694 AW410439 NM.006429 AF026292 T35505 BE542333 T08940 AU076737 AW247471 

BE393215 AW32B640 BE542408 T32170 BE302544 T31955 BE206898 BE275738 T32570 BE386426 BE298746 BE389937 BE293991 BE315289 
BE389578 R34739 R15312 BE279365 8E277756 A1036019 T33725 BE277779 BE302962 AL047294 BE276505 T09070 T33673 BE312580 
AW387774 BE257175 AW674367 BE253331 BE270344 BE299831 BE273576 T32062 A1751831 BE618381 AA304899 BE252268 U46364 
BE256790 BE207199 BE256209 BE251941 BE250791 BE313955 BE269806 BE543623 BE279212 BE252289 T31699 BE262220 T31669 
AA315781 AA192212 N84547 BE292737 BE259631 AA232179 AI133144 T31292 AA315945 BE407301 BE251184 BE409006 AI880158 AI904003 
AI904114 AW651768 AW651763 R58247 BE271897 U83843 C05298 BE261609 BE255973 AA351650 N84631 BE263637 AW452910 AA328465 
AA324549 AW579525 BE252295 BE257551 AL048332 BE208630 AA359336 AW327897 AA151742 AA305816 8E076862 BE076796 BE263161 
AA323785 AA676588 AA626565 AA078917 W87657 R09002 R94021 AA312032 BE276665 AA295608 AW407162 AA329374 AW877912 N27885 
AA369256 AA360968 BE250476 K85427 BE265559 AI278639 AI81 6576 AI691 037 AW328583 AI567949 AI983455 AI927732 AI81 1 297 A1571508 
AVV073674BE296039BE467326AI828796AI816578AW511604AI921213AW152427AI795787AI801618AW1688W 
AW173690 AW51 1540 BE535620 AA38301 4 BE301 164 AI866596 AW514909 AA658050 AW575243 AA074631 AI093488 AW57540B AW675443 
AW615636AW32207AVTO77638AA321784AA641629M63^ 
Tables 2A-8C were previously filed on November 9, 2001 in USSN 60/339,245 (18501-004100US)

Table 2A shows 504 genes down-regulated in lung tumors relative to normal lung and chronically diseased lung. Chronically diseased lung samples represent chronic
non-malignant lung diseases such as fibrosis, emphysema, and bronchitis. These genes were selected from 59680 probesets on the Eos/Affymetrix Hu03 Genechip array. Gene
expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting the relative level of mRNA expression.
90th percentile of AI for normal lung samples divided by the 80th percentile of AI for adenocarcinoma and squamous cell carcinoma lung tumor

rrSSrfof Al tor normal lung samples divided by 90th percentile of Al for adenocarcinoma and squamous cell carcinoma lung tumor samples, 
median of AI for normal lung samples minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples divided by
the 90th percentile of AI for adenocarcinoma and squamous cell carcinoma lung tumor samples minus the 15th percentile of AI for all normal
lung, chronically diseased lung and tumor samples.

average of AI for normal lung samples divided by average AI for squamous cell carcinoma and adenocarcinoma lung tumors.
median of AI for normal lung samples divided by the 90th percentile of AI for adenocarcinomas.
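
Written out as a formula, the background-subtracted ratio defined above (the definition that subtracts the 15th percentile from both numerator and denominator) may be easier to read. The notation is introduced here purely for illustration and does not appear in the original table key: P_q denotes the q-th percentile of AI over a sample group, N the normal lung samples, T the adenocarcinoma and squamous cell carcinoma tumor samples, and A all normal, chronically diseased and tumor samples together.

```latex
\[
R \;=\; \frac{\operatorname{median}\!\left(\mathrm{AI}_{N}\right) \;-\; P_{15}\!\left(\mathrm{AI}_{A}\right)}
             {P_{90}\!\left(\mathrm{AI}_{T}\right) \;-\; P_{15}\!\left(\mathrm{AI}_{A}\right)}
\]
```

Subtracting the same 15th-percentile term from both parts treats it as a common background level, so a probeset whose signal sits near background in tumors yields a large ratio only when the normal-lung median is well above that background.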
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Hs.270982 


ESTs 


124348  AI796320     Hs.10299   ESTs
124358  AW070211     Hs.102415  yw35g11.s1 Morton Fetal Cochlea Homo sa
124409  AI814166     Hs.107197  ESTs
124442  AW663532     Hs.285625  TATA box binding protein (TBP)-associate
124468  N51413       Hs.109284  ESTs
124479  AB011130     Hs.127436  calcium channel; voltage-dependent; alph
124519  AI670056     Hs.137274  ESTs; Weakly similar to SPLICEOSOME ASSO
124711  NM_004657    Hs.26530   serum deprivation response (phosphatidyl
124866  AJ768289     Hs.304389  ESTs
124874  BE550182     Hs.127826  ESTs
125097  AW576389     Hs.335774  ESTs
125179  AW206468     Hs.103118  ESTs
125200  AW836591     Hs.103156  ESTs
125299  T32982       Hs.102720  ESTs
125400  AL110151     Hs.128797  DKFZP586D0824 protein
125810  H00083                  aryl hydrocarbon receptor-interacting pr
126176  BE242256     Hs.2441    KIAA0022 gene product
126303  D78841                  HUM525A05B Human placenta polyA+ (TFuji
126403  AW629054     Hs.125976  ESTs; Weakly similar to metalloprotease/
126507  AL040137     Hs.23964   ESTs; Weakly similar to HC1 ORF [M.muscu
126773  AA648284     Hs.187584  ESTs
127307  AW962712     Hs.126712  ESTs; Weakly similar to pIL2 hypothetica
127462  AA760776     Hs.293977  aa59b04.s1 NCI_CGAP_GCB1 Homo sapiens c
127486  AW002846     Hs.105468  ESTs
127572  AA594027     Hs.191788  ESTs
127609  X80031       Hs.530     ESTs
127832  AW976035     Hs.292396  ESTs
127898  AA774725     Hs.128970  ESTs
128073  AW340720     Hs.125983  ESTs
128101  AA905730     Hs.128254  ESTs
128149  NM_012214    Hs.177576  mannosyl (alpha-1;3-)-glycoprotein beta-
128212  W27411       Hs.336920  glutathione peroxidase 3 (plasma)
128333  W68800       Hs.12126   ESTs; Weakly similar to LR6 [H.sapiens]
128364  N76462       Hs.269152  ESTs; Weakly similar to ZINC FINGER PROT
128426  AI265784     Hs.145197  ESTs
128598  AA305407     Hs.102308  potassium inwardly-rectifying channel; s
128634  AA464918                ESTs; Moderately similar to !!!! ALU SUB
128687  AW271273     Hs.23767   ESTs
128726  AI311238     Hs.104476  ESTs
128773  NM_004131    Hs.1051    granzyme B (granzyme 2; cytotoxic T-lymp
128833  W26667       Hs.184581  ESTs
128870  H39537       Hs.75309   eukaryotic translation elongation factor
128878  R25513       Hs.10683   ESTs
128885  AF134803     Hs.180141  cofilin 2 (muscle)
128998  W04245       Hs.107761  ESTs; Weakly similar to PUTATIVE RHO/RAC
129000  AA744902     Hs.107767  ESTs; Moderately similar to CaM-KII inhi
129038  AW156903     Hs.108124  ribosomal protein L41
129098  AW580945     Hs.330466  ESTs
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129210 
129240 
129262 
129301 
129331 
129361 
129585 
129595 
129613 
129782 
129950 
129958 
129959 
130160 
130259 
130273 
130312 
130436 
130523 
130799 
130885 
131002 
131012 
131031 
131061 
131066 
131082 
131087 
131161 
131179 
131182 
131205 
131277 
131281 
131282 
131285 
131355 
131391 
131461 
131487 
131517 
131545 
131583 
131647 
131675 
131676 
131708 
131717 
131756 
131762 
131821 
131839 
131861 
132015 
132070 
132242 
132334 
132476 
132490. 
132533 
132598 
132619 
132652 
132726 
133028 
133071 
133120 
133129 
133147 
133151 
■ 133213 
133276 
133377 
133407 
133535 
133537 
133656 
133689 
133779 
133978 
133985 
134000 
134111 
134185 
134204 



AL039940 

AA361258 

BE222198 

AF182277 

AW167668 

AW2458 05 

X77777 

U09550 

AW978517 

AW016932 

F07783 

R27496 

AL036554 

AA305688 

NMJJ00328 

AW972422 

AF056195 

NMJW1928 

AA999702 

AB028945 

NMJW5883 

AL050295 

AL039940 

NMJJ01650 

N64328 

AW169287 

AI091121 

AF147709 

AF033382 

AA171388 

AI824144 

NMJ03102 

AA131466 

AA251716 

X03350 

A1567943 

R52804 

AW085781 

AA992841 

F13036 

AB037789 

AL137432 

AK000383 

AA359615 

H15205 

AI126821 

S60415 

X94630 

AA443966 

AA744902 

AA017247 

AB014533 

AL096858 

A1418006 

BE622641 

AA332697 

AW080704 

AL1 19844 

NMJ01290 

AI922988 

X80031 

H28855 

N41739 

N52298 

R51604 

BE384932 

NWL003278 

AA428580 

M026533 

NM.014051 

AA903424 

AW978439 

AJ131245 

AF017987 

AL134030 

U41518 

BE149455 

NWL001872 

T58486 

AF035718 

L34657 

AW175787 

A137258B 

AA285136 

AI873257 



Hs.202949 

Hs.237868 

Hs.109843 

Hs.330780 

Hs.279772 

Hs.1 10903 

Hs.198726 

Hs.1 154 

Hs.172847 

Hs.1 04105 

Hs.1369 

Hs.1378 

Hs.274463 

Hs.267695 

Hs.153614 

Hs.153853 

Hs.1 5430 

Hs.155597 

Hs.214507 

Hs.1 2696 

Hs.20912 

Hs.22039 

Hs.202949 

Hs.288650 

Hs.268744 

Hs.22588 

Hs.246218 

Hs.22824 

Hs.23735 

Hs.184482 

Hs.23912 

Hs.2420 

Hs.23767 

Hs.25227 

Hs.4 

Hs.25274 

Hs.25956 

Hs.26270 

Hs,27263 

Hs.27373 

Hs.263395 

Ks.28564 

Hs.323092 

Hs.30089 

Hs.30509 

Hs.30514 

Hs.30941 

Hs.3107 

Hs.31595 

Hs.107767 

Hs.164577 

Hs.33010 

Hs.184245 

Hs.3731 

Hs.38489 

Hs.42721 

Hs.45033 

Hs.49476 

Hs.4980 

Hs.1 72510 

Hs.530 

Hs.53447 

Hs.61260 

KS.5560B 

Hs.300842 

Hs.64313 

Hs.65424 

Hs.65551 

Hs.66 

Hs.94896 

Hs.6786 

Hs.69504 

Hs.7239 

Hs.7306 

Hs.284180 

Hs.74602 

Hs.75415 

Hs.75572 

Hs.222566 

Hs.78061 

Hs.78146 

Hs.334841 

Hs.8022 

Hs.301914 

Hs.7994 
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4.09 



KIAA1102 protein

interleukin 7 receptor 2.29

ESTs 3.30

Human cytochrome P450-IIB (hIIB3) mRNA;

ESTs; Highly similar to CGI-38 protein (

claudin 5 (transmembrane protein deleted 2.93

vasoactive intestinal peptide receptor 1 1 60.80

oviductal glycoprotein 1; 120kD 10.00

ESTs; Weakly similar to collagen alpha 1 a40

EST 9.00

decay accelerating factor for complement 87.80

annexin A3 44.60

defensin; alpha 1; myeloid-related seque 2.72

UDP-Gal:betaGlcNAc beta 1;3-galactosyltr 42.20

retinitis pigmentosa GTPase regulator 2.54

MAD (mothers against decapentaplegic; Dr 51.60

DKFZP586G1219 protein 3.16

D component of complement (adipsin)

ESTs 4.77
ESTs 6.00

adenomatous polyposis coli like 3.54
KIAA0758 protein

KIAA1102 protein 20.00
aquaporin 4 41.20
ESTs; Moderately similar to KIAA0273 [H. 31.40
ESTs 29.60
ESTs; Weakly similar to zinc finger prot 9.00
ESTs; Weakly similar to p160 myb-binding

potassium voltage-gated channel; subfami 3.14

DKFZP586D0624 protein 3.80
ESTs

superoxide dismutase 3; extracellular 2.98

ESTs 3.15
ESTs 32.20
alcohol dehydrogenase 3 (class I); gamma

ESTs; Moderately similar to putative sev 6.40
DKFZP564O206 protein 8.00
ESTs 10.00
butyrate response factor 2 (EGF-response 26.80

Homo sapiens mRNA; cDNA DKFZp56401763 (f 4.03
ESTs; Highly similar to semaphorin VIa [ 39.00

ESTs 11.00

ESTs; Weakly similar to dual specificity 10.00

ESTs 2.47

ESTs 3.06

ESTs 45.80

calcium channel; voltage-dependent; beta 2.28

CD97 antigen

ESTs 40.60
ESTs; Moderately similar to CaM-KII inhi

ESTs 2.87

KIAA0S33 protein 3.48
KIAA0929 protein Msx2 interacting nuclea 54.00
ESTs
ESTs

ESTs 2.68
lacrimal proline rich protein 4.66
Homo sapiens clone TUA8 Cri-du-chat regi 34.20
LIM binding domain 2 2.66
ESTs 13.00
collagen; type IV; alpha 3 (Goodpasture

ESTs; Moderately similar to kinesin ligh 4.02
ESTs 3.18
ESTs; Weakly similar to cDNA EST yk484g1 11.43
ESTs 2.37

ESTs 2.27

tetranectin (plasminogen-binding protein 2.63

ESTs

interleukin 1 receptor-like 1 6.20

ESTs 3.69

ESTs 31.40

ESTs 9.00

SEC24 (S. cerevisiae) related gene famil 41.20

secreted frizzled-related protein 1 50.20

protocadherin 2 (cadherin-like 2) 3.72
aquaporin 1 (channel-forming integral pr

Accession not listed in Genbank 2.65

carboxypeptidase B2 (plasma) 90.80

ESTs 3.05

transcription factor 21 2.92

platelet/endothelial cell adhesion molec

selenium binding protein 1

TU3A protein 4.49

Homo sapiens mRNA; cDNA DKFZp586K1220 (f 3.27
ESTs; Weakly similar to CO-69 protein ( 40.60
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134641 AI092634 Hs.156114 

134677 AA251363 Hs. 177711 

134745 NMJJ00685 Hs.89472 

134749 T28499 Hs.89485 

134786 T29618 

134825 U33749 

134978 AJ829008 

135010 N5Q465 

135053 AW796190 Hs.93678 

135081 AF069517 Hs.173993 

135091 AA493650 

135135 AA775910 

135203 C15737 

135236 AJ636208 

135266 R41179 

135346 ML000928 Hs.992 

135378 AW961818 Hs.24379 

135387 NMJJ01972 Hs.99853 

135388 W27965 Hs.99865 
135402 L12398 Hs.99922 



Hs.69640 
Hs.197764 
Hs.333383 
Hs.92927 



Hs.94367 

Hs.95011 

Hs.269386 

Hs.96901 

Hs.97393 



protein tyrosine phosphatase; non-recept 
ESTs 

angiotensin receptor 1B 

carbonic anhydrase IV 

angiopoietin 1 receptor; TEK tyrosine Id 

thyroid transcription factor 1 

ftcoftn (coflagen/Sbrfnogen domain-cont 

ESTs 

ESTs 

RNA binding motif protein 6 
ESTs 

syntrophin; beta 1 (dystrophln-associate 

ESTs 

ESTs 

Human mRNA for KIAA0328 gene; partial cd 
phosphodpase A2; group IB (pancreas) 
potassium voltage-gated channel; shaker- 
elastase 2; neutrophil 
EST 

dopamine receptor 04 



15.00 



3.05 
Z52 



32.20 
57.80 
31.60 



2B.80 



43.00 



37.20 
38.80 
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3.76 



3.73 



3.21 



4.31 



4.21 



4.24 



6.42 



TABLE 2B shows the accession numbers for those primekeys lacking UnigeneIDs for Table 2A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
'Accession' column.

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



Pkey 

108447 
108550 
108655 
102397 
126303 
125810 
103627 
121366 
114509 
115272 
108338 
108434 
123802 
102310 
102636 
104776 
120504 
113502 



101308 
108629 
103098. 
103241 
103508 
103575 
119514 
121082 
128634 
105817 
121518 
114449 
114648 
121950 
107723 



CAT number Accessions 

43452.-7 AA079126 

120073 1 AA084867AA084996 

127522.1 AA099960AA1 13013 

44371.-1 U41898 

1525933J D78841 078880 

1554054.1 H00083 R81062 

2615.2 Z48513Z48512 

280401 J A1743515AA405617AW276708 . 

116777.1 AA079505AA079537 

1721 13.1 AW015947 AA21 1690 AA279425 

112186J AA070773M070774 

114012.1 AA078899AA078782AA075788 

genbank_AA620448 AA620448

NOT_FOUND_entrez_U33839 U33839

entrez_U67092 U57092

genbank_AA026349 AA026349

genbank_AA256837 AA256837

genbank_T89130 T89130

genbank_AA083103 AA083103

entrez_L41390 L41390

genbank_AA102425 AA102425

221 215 M88361 Z26593 X02850 D13070 AE000659 M17649 M87869 M87871 X61077 M16286 AF018169 X61079 S59351 X60142 AF043169

entrez_X76223 X76223

entrez_Y10141 Y10141

entrez_Z26256 Z26256

NOT_FOUND_entrez_W37937 W37937

genbank_AA398722 AA398722

AA464918_at AA464918

genbank_AA397825 AA397825

genbank_AA412155 AA412155

genbank_AA020736 AA020736

genbank_AA101056 AA101056

genbank_AA429515 AA429515

genbank_AA015967 AA015967
Table 3A shows 452 genes up-regulated in chronically diseased lung relative to normal lung. Chronically diseased lung samples represent chronic non-malignant lung diseases
such as fibrosis, emphysema, and bronchitis. These genes were selected from 59880 probesets on the Eos/Affymetrix Hu03 Genechip array. Gene expression data for each
probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting the relative level of mRNA expression.

Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: 80th percentile of AI for chronically diseased lung samples divided by the 90th percentile of AI for normal lung samples.
R2: 80th percentile of AI for chronically diseased lung samples divided by the 90th percentile of AI for normal lung samples, squamous cell carcinomas and adenocarcinomas.
R3: 70th percentile of AI for chronically diseased lung samples minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples,
divided by the 90th percentile of AI for normal lung samples, squamous cell carcinomas and adenocarcinomas minus the 15th percentile of AI for all normal lung,
chronically diseased lung and tumor samples.
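
The three ratios defined above for Table 3A can be illustrated with a short calculation. The following is a minimal sketch only, not the analysis code used to produce the tables: the function names `percentile` and `table_3a_ratios` are hypothetical, the intensity values in the usage note are invented, and the percentile is computed by linear interpolation between ranked values, which is an assumption because the text does not state which percentile convention was used.

```python
def percentile(values, q):
    """Return the q-th percentile (0-100) of values.

    Assumption: linear interpolation between ranked values; the source
    does not say which percentile convention was applied.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty sample")
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def table_3a_ratios(diseased, normal, tumor):
    """Compute (R1, R2, R3) for one probeset from lists of AI values.

    diseased: AI values for chronically diseased lung samples
    normal:   AI values for normal lung samples
    tumor:    AI values for squamous cell carcinoma and adenocarcinoma samples
    """
    normal_and_tumor = list(normal) + list(tumor)
    all_samples = list(diseased) + normal_and_tumor
    # 15th percentile over every sample serves as a common background term.
    background = percentile(all_samples, 15)

    r1 = percentile(diseased, 80) / percentile(normal, 90)
    r2 = percentile(diseased, 80) / percentile(normal_and_tumor, 90)
    # No guard against a zero denominator; a real pipeline would need one.
    r3 = (percentile(diseased, 70) - background) / (
        percentile(normal_and_tumor, 90) - background
    )
    return r1, r2, r3
```

For example, with made-up intensities of 100 in every diseased sample, 10 in every normal sample and 20 in every tumor sample (five samples each), `table_3a_ratios([100] * 5, [10] * 5, [20] * 5)` gives R1 = 10, R2 = 5 and R3 = 9: the background term is 10, so R3 is (100 - 10) / (20 - 10).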



Pkey    ExAccn       UnigeneID  Unigene Title  R1

135423  U50531       Hs.138751  Human BRCA2 region, mRNA sequence CG030  12.40
135378  AW961818     Hs.24379   MUM2 protein
135346  NM_000928    Hs.992     phospholipase A2, group IB (pancreas)  12.40
135235  AW298244     Hs.293507  ESTs
135057  U90268       Hs.93810   cerebral cavernous malformations 1  11.67
134951  BE305081     Hs.169358  hypothetical protein
134799  M36821       Hs.89690   GRO3 oncogene
134786  T29618       Hs.89640   TEK tyrosine kinase, endothelial (venous
134772  NM_000829    Hs.163697  glutamate receptor, ionotrophic, AMPA 4  29.80
134752  BE246762     Hs.89499   arachidonate 5-lipoxygenase
134749  T28499       Hs.89485   carbonic anhydrase IV
134696  BE326276     Hs.8861    ESTs
134636  NM_005582    Hs.87205   lymphocyte antigen 64 (mouse) homolog, r  13.60
134627  AI018768     Hs.12482   glyceronephosphate O-acyltransferase
134622  AW975159     Hs.293097  ESTs, Weakly similar to A55380 faciogeni
134570  G66615       Hs.172280  SWI/SNF related, matrix associated, acti  13.20
134561  U76421       Hs.85302   adenosine deaminase, RNA-specific B1 (h
134468  NM_001772    Hs.83731   CD33 antigen (gp67)
134417  NM_006416    Hs.82921   solute carrier family 35 (CMP-sialic aci
134343  D50683       Hs.82028   transforming growth factor, beta recepto
134323  BE170651     Hs.8700    deleted in liver cancer 1
134300  NM_001430    Hs.8136    endothelial PAS domain protein 1
134299  AW580939     Hs.97199   complement component C1q receptor
134253  X52075       Hs.80738   sialophorin (gpL115, leukosialin, CD43)  20.60
134182  D52059       Hs.7972    KIAA0871 protein  11.20
133985  L34657       Hs.78146   platelet/endothelial cell adhesion molec
133978  AF035718     Hs.78061   transcription factor 21
133835  AI677897     Hs.76640   RGC32 protein
133651  AI301740     Hs.173381  dihydropyrimidinase-like 2
133633  D21262       Hs.75337   nucleolar and coiled-body phosphoprotein  15.20
133565  AW955776     Hs.313500  ESTs, Moderately similar to ALU7_HUMAN A
133548  AW946384     Hs.178112  DNA segment, single copy probe LNS-CAI/L
133488  AA335295     Hs.74120   adipose specific 2
133478  X83703       Hs.31432   cardiac ankyrin repeat protein
133337  AF085983     Hs.293676  ESTs
133200  AB037715     Hs.183639  hypothetical protein FLJ10210
133153  AF070592     Hs.66170   HSKM-B protein  30.60
133130  AI128606     Hs.6557    zinc finger protein 161  22.60
133120  NM_003278    Hs.65424   tetranectin (plasminogen-binding protein
132926  AW168082     Hs.169449  protein kinase C, alpha  13.80
132836  AB023177     Hs.29900   KIAA0960 protein
132799  W73311       Hs.169407  SAC2 (suppressor of actin mutations 2,  41.60
132742  AA025480     Hs.292812  ESTs, Weakly similar to T33468 hypotheti  40.40
132548  X12830       Hs.193400  interleukin 6 receptor
132476  AL119844     Hs.49476   Homo sapiens clone TUA8 Cri-du-chat regi
132439  AK001942     Hs.4863    hypothetical protein DKFZp566A1524
132240  AB018324     Hs.42676   KIAA0781 protein  21.20
132210  NM_007203    Hs.42322   A kinase (PRKA) anchor protein 2
132199  AL041299     Hs.165084  ESTs  15.20
131751  T96555       Hs.31562   ESTs
131745  AI828559     Hs.31447   ESTs, Moderately similar to A46010 X-li  27.80
131694  NM_000246    Hs.3076    MHC class II transactivator
131685  NM_012296    Hs.30687   GRB2-associated binding protein 2
131676  AI126821     Hs.30514   ESTs
131629  Z45794       Hs.238809  ESTs  21.40
131589  C18825       Hs.29191   epithelial membrane protein 2
131536  AA019201     Hs.269210  ESTs
131517  AB037789     Hs.263395  sema domain, transmembrane domain (TM),
131355  R52804       Hs.25956   DKFZP564O206 protein
131253  R71802       Hs.24853   ESTs  15.00
131207  AF104266     Hs.24212   latrophilin
131156  AI472209     Hs.323117  ESTs
131066  AW169287     Hs.22588   ESTs
131061  N84328       Hs.268744  KIAA1796 protein
131053  AA348541     Hs.296261  guanine nucleotide binding protein (G pr
130895  AA641767     Hs.21015
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Hs.12940 


z/nc-firtgers and homeoboxes 1 


24.71 






110837 


H03109 


Hs,108920 


HT01 8 protein 






2.18 


110824 


AI767183 


Hs.26942 


ESTs 


12.20 






110776 


AB032417 


Hs.19545 


frizzled (Drosophila) homolog 4 






1.75 


110576 


H60859 


Hs.37889 


ESTs 


13.00 






110359 


AK000768 


Hs.107872 


hypothetical protein FU 20761 




5.60 




110099 


R44557 


Hs.23748 


ESTs 






2.31 


109984 


AI796320 


Hs.10299 


Homo sapiens cDNA FU 13545 fis, done PL 








109958 


AA001266 


Hs.133521 


ESTs 


11.25 






109893 


AA684208 


Hs.30484 


ESTs 






2.68 
109842  AW818436     Hs.23590   solute carrier family 16 (monocarboxylic  23.83
109837  H00656       Hs.29792   ESTs, Weakly similar to I38022 hypotheti
109796  AI800515     Hs.12024   ESTs
109688  R41900       Hs.22245   ESTs  22.80
109648  H17800       Hs.7154    ESTs
109613  H47315       Hs.27519   ESTs
109550  AW021488     Hs^6981    ESTs
109523  AW193342     Hs.24144   ESTs
109472  AK001989     Hs.91165   hypothetical protein
109355  AA524525     Hs.48297   DKFZP586C1620 protein  15.00
109260  AW978515     Hs.131915  KIAA0853 protein  25.60
108781  AA128654                gb:zn98g07.s1 Stratagene fetal retina 93  14.20
108663  BE219231     Hs^92653   ESTs, Weakly similar to T26845 hypotheti  11.00
108573  AA086005                gb:zl84c04.s1 Stratagene colon (937204)  26.00
108480  AL133092     Hs.68055   hypothetical protein DKFZp434I0428
108382  NM_006770    Hs.67726   macrophage receptor with collagenous str  15.20
108174  AA055632     Hs.303070  ESTs
108138  AL049990     Hs.51515   Homo sapiens mRNA; cDNA DKFZp564G112 (fr  15.44
108087  AA045708     Hs.40545   ESTs
108048  AI797341     Hs.165195  Homo sapiens cDNA FLJ14237 fis, clone NT
108041  AW204712     Hs.61957   ESTs
107997  AL049176     Hs.82223   chordin-like
107994  AA036811     Hs.48469   LIM domains containing 1
107922  BE153855     Hs.61460   Ig superfamily receptor LNIR  14.20
107681  BE379594     Hs.49136   ESTs, Moderately similar to ALU7_HUMAN A  51.80
107666  AA010611     Hs.60418   EST  29.20
107332  T87750       Hs.183297  DKFZP566F2124 protein  10.73
107292  BE166479     Hs.4789    Homo sapiens serologically defined breas  32.00
107230  AI034467     Hs.34650   ESTs  17.40
107168  W57578       Hs.237955  RAB7, member RAS oncogene family  10.43
107160  AA314490     Hs.27669   KIAA1563 protein  11.40
107054  AI076459     Hs.15978   KIAA1272 protein
107029  AF264750     Hs.288971  myeloid/lymphoid or mixed-lineage leukem  21.40
106999  H93281       Hs.10710   hypothetical protein FLJ20417  35.80
106954  AF128847     Hs.204038  indolethylamine N-methyltransferase
106870  AI983730     Hs.26530   serum deprivation response (phosphatidyl  13.40
106865  AW192535     Hs.19479   ESTs
106844  AA485055     Hs.158213  sperm associated antigen 6
106820  NM_016831    Hs.12592   period (Drosophila) homolog 3
106818  AK002135     Hs.3542    hypothetical protein FLJ11273  13.00
106797  AI768801     Hs.169943  Homo sapiens cDNA FLJ13569 fis, clone PL
106773  AA478109     Hs.188833  ESTs
106747  NM_007118    Hs.171957  triple functional domain (PTPRF interact  12.60
106743  BE613328     Hs.21938   hypothetical protein FLJ12492  10.60
106667  AW360847     Hs.16578   ESTs
106605  AW772298     Hs.21103   Homo sapiens mRNA; cDNA DKFZp564B076 (fr
106567  AW450408     Hs.86412   chromosome 9 open reading frame 5
106562  AL031846     Hs.152151  plakophilin 4
106536  AA329648     Hs.23604   ESTs, Weakly similar to PN0099 son3 prot  23.20
106533  AL134708     Hs.145998  ESTs
106507  AA259068     Hs.267819  protein phosphatase 1, regulatory (inhib  15.20
106490  AA404265     Hs.115537  putative dipeptidase  10.44
106474  BE383668     Hs.42484   hypothetical protein FLJ10616
106211  AA428240     Hs.126083  ESTs
105986  AB037722     Hs.8707    KIAA1301 protein
105894  AI904740     Hs.25691   receptor (calcitonin) activity modifying
105847  AW964490     Hs.32241   ESTs, Weakly similar to S65657 alpha-1C
105803  AW747996     Hs.160999  ESTs, Moderately similar to A56194 throm
105731  AA834664     Hs.29131   nuclear receptor coactivator 2  10.71
105729  H46612       Hs.293815  Homo sapiens HSPC285 mRNA, partial cds  23.40
105688  AI299139     Hs.17517   ESTs
105510  Z42047       Hs.283978  Homo sapiens PRO2751 mRNA, complete cds  37.20
105101  H63202       Hs.38163   ESTs
104989  R65998       Hs.285243  hypothetical protein FLJ22029
104986  AW088826     Hs.117176  poly(A)-binding protein, nuclear 1
104969  AI670947     Hs.78406   phosphatidylinositol-4-phosphate 5-kinas
104903  AI436323     Hs.31141   Homo sapiens mRNA for KIAA1568 protein,
104896  AW015318     Hs.23165   ESTs  13.80
104865  T79340       Hs.22575   Homo sapiens cDNA: FLJ21042 fis, clone C
104825  AA035613     Hs.141883  ESTs
104781  AA099904     Hs.21610   DKFZP434B203 protein
104776  AA026349                gb:zj99f01.s1 Soares_pregnant_uterus_NbH
104691  U29690       Hs.37744   Homo sapiens beta-1 adrenergic receptor
104667  AI239923     Hs.30098   ESTs
104404  H58762                  gb:EST00057 HE6W Homo sapiens cDNA clone
104392  AA076049     Hs.274415  Homo sapiens cDNA FLJ10229 fis, clone HE  27.20
104212  AB002298     Hs.173035  KIAA0300 protein
104074  AL162039     Hs.31422   Homo sapiens mRNA; cDNA DKFZp434M229 (fr  11.20
103749  AL135301     Hs.8768    hypothetical protein FLJ10849  10.86
103645  AW246253     Hs.7043    succinate-CoA ligase, GDP-forming, alpha  12.00
103554  AI878826     Hs.323469  caveolin 1, caveolae protein, 22kD
103541  AI815501     Hs.79197   CD83 antigen (activated B lymphocytes, i
103498  Y09267       Hs.132821  flavin containing monooxygenase 2
103428  BE383507     Hs.78921   A kinase (PRKA) anchor protein 1  11.20
103353  X89399       Hs.119274  RAS p21 protein activator (GTPase activa  19.80



3.91 



1720 
9.60 



6.00 



1.83 



3.60 
11.40 
4.76 



7.13 
7.00 



29.80 
3.70 



8.30 
8.09 

5.40 
7.60 



10.20 
5.69 
3.82 
4.20 



1.76 



205 



2.40 
1.78 
1.76 
2.19 



1.94 
1.75 
2.47 



1.92 



1.87 
1.93 



1.91 



1.80 
103295


X81479 


Hs.2375 


103280 


U84722 


HsJ6206 


103100 


NMJJ05574 


Hs.1 84585 


103025 


NIO02837 


Hs.123641 


102898 


M18667 


Hs.1867 


102659 


BE245169 


Hs.211610 


102580 


U50308 


Hs. 152981 


102417 


AA034127 


Hs.1 53487 


102363 


NMJJ03734 


Hs.193241 


102302 


AA306342 


Hs.69171 


102283 


AW161552 


Hs.83381 


102188 


U20350 


Hs.78913 


102151 


T27013 


Hs.3132 


101957 


L28824 


Hs.74101 


101842 


M93221 


Hs.75182 


101771 


NM.002432 


Hs. 153837 


101764 


A1198550 


Hs.81256 


101716 


AF050658 


Hs.2563 


101678 


M62505 


Hs.2161 


101447 


M21305 




101383 


NM-000132 


Hs.79345 


101346 


AI738616 


Hs.77348 


101345 


NMJ05795 


Hs.1 52175 


101336 


NO06732 


Hs.75678 


101330 


L43821 


Hs.80261 


101277 


BE297626 


Hs.296049 


101262 


L35854 




101168 


NH.005308 


Hs.211569 


101102 


NM.003243 


Hs.79059 


101088 


X70697 


Hs.553 


101066 


AW970254 


Hs.889 


100971 


BE379727 


Hs.83213 


100893 


BE245294 


Hs.1 80789 


100770 


W25797.comp 


Hs.177486 


100716 


X89887 


Hs.172350 


100555 


M69181 




100425 


NAL014747 


Hs.78748 


100408 


D86640 


Hs.56045 


100382 


D83407 


Hs.1 56007 


100351 


D64158 




100299 


D49493 


Hs.2171 


100134 


AA305746 


Hs.49 


100108 


U09577 


Hs.76873 


100095 


Z97171 


Hs.78454 


100066 







egMke module containing, mudn-IBca, 
cadherin 5, type 2, VE-cadherin (vascuia 
UM domain only 2 {rhombotin-like 1) 
protein tyrosine phosphatase, receptor t 
progastricsin (pepsinogen C) 
CUG triplet repeat, RNA-binding protein 
COP-dlacylgtycerol synthase (phosphatlda 
signal transducing adaptor molecule (SH3 
amine oxidase, copper containing 3 {vase 
protein kinase C-Jtke 2 
guanine nucleotide binding protein 11 
chemokine (C-X3-C) receptor 1 
steroidogenic acute regulatory protein 



160 



man nose receptor, C type 1 
myeloid cell nuclear differentiation ant 
SI 00 caldum-blnalng protein A4 (calcium 
tachykinin, precursor 1 (substance K, su 
complement component 5 receptor 1 (C5a I 



coagulation factor VIII, procoagulantco 
hydroxyprostagland'm dehydrogenase 15-(N 
calcitonin receptor-like 
FBJ murine osteosarcoma viral oncogene h 
enhancer of tilamentation 1 (cas-Gke do 
mtcrofibrillar-associated protein 4 
gb:Human dystrophin (dp140) mRNA, 5* end 
G protein-coupled receptor kinase 5 
transforming growth factor, beta recepto 
solute carrier family 6 (neurotransmltte 
Charot-Leyden crystal protein 
fatty acid binding protein 4, adipocyte 
S164 protein 

amyloid beta (A4) precursor protein (pro 
HIR Oilstone cell cycle regulation defec 
gb:Human nonmuscle myosin heavy chaln-B 
KIAA0237 gene product 
sre homology three (SH3) and cysteine ri 
Down syndrome critical region gene 1-iik 

growth differentiation factor 10 
macrophage scavenger receptor 1 
hyaturonoglucosaminidase 2 
myocflln, trabecular meshwork inducible 



1.76 
il5 



11.00 
25.40 
14.00 

10.86 



16.40 
15.40 



18.80 
504.80 



19.00 



19.38 

15.40 
11.20 
14.80 
33.00 
16.20 



7,40 



31.00 



7.52 



1.78 
2.22 

1.75 
2.24 

2.01 

1.91 



4.00 
4.24 
6.20 
21.20 



5.40 



1.79 



11.29 



TABLE 3B shows the accession numbers for those primekeys lacking unigeneID's for Table 3A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



Pkey  CAT number  Accessions

123619  371681_1  AA602964 AA609200
126433  127143_1  AA325606 AA099517 N89423
125831  1522905_1  H04043 D60988 D60337
126816  122973_1  AA248234 AA090985
126852  136135_1  AA399961 AA128347
121059  273450_1  AA393283 AA398628
120637  200885_1  AAB11804 AA809404 AA288907 AW977624
122011  7617_-2  M431082
120934  177521_1  AA226198 AA226513 AA383773
123802  genbank_AA620448  AA620448
116814  genbank_H50834  H50834
118329  genbank_N63520  N63520
104404  H58762J*  H58762
104776  genbank_AA026349  AA026349
113502  genbank_T89130  T89130
101262  entrez_L35854  L35854
108573  genbank_AA086005  AA086005
101447  entrez_M21305  M21305
124357  genbank_N22401  N22401
108781  genbank_AA128654  AA128654
112794  genbank_R97018  R97018
100351  entrez_D64158  D64158
100555  Bgr HT2245  M69181 M81105 U51039
Table 4A shows 202 genes up-regulated in samples from patients treated with chemotherapy or radiotherapy. These genes were selected from 59680 probesets on the
Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting
the relative level of mRNA expression.

Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: average of AI for samples from patients treated with chemotherapy or radiotherapy divided by the average of AI for normal lung samples.
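By way of illustration only, the R1 ratio defined above can be sketched in Python as follows. The function name r1_ratio and the intensity values are hypothetical and are not taken from the analysis; the sketch assumes that the per-sample AI values for a single probeset have already been normalized.

```python
from statistics import mean


def r1_ratio(treated_ai, normal_ai):
    """Average AI of treated-patient samples divided by average AI of normal lung samples."""
    if not treated_ai or not normal_ai:
        raise ValueError("both sample groups must be non-empty")
    normal_mean = mean(normal_ai)
    if normal_mean == 0:
        raise ValueError("mean AI of normal samples is zero; ratio undefined")
    return mean(treated_ai) / normal_mean


# Hypothetical AI values for one probeset (illustrative, not from the tables).
example_r1 = r1_ratio([250.0, 310.0, 340.0], [10.0, 12.0, 8.0])
```

A probeset whose treated-sample average is thirty times its normal-lung average would be reported with an R1 of 30 under this definition.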



Pkey  ExAccn  UnigeneID  Unigene Title  R1

100113  NM_001269  Hs.84746  chromosome condensation 1  27.20
100187  D17793  Hs.78183  aldo-keto reductase family 1, member C3  20.60
100210  D26361  Hs.3104  KIAA0042 gene product  20.40
100225  D28539  Hs.167185  glutamate receptor, metabotropic 5  20.60
100269  NM_001949  Hs.1189  E2F transcription factor 3  29.40
100438  AA013051  Hs.91417  topoisomerase (DNA) II binding protein  23.50
100877  X80821  Hs.27973  KIAA0874 protein  35.56
100893  BE245294  Hs.180789  S164 protein  43.40
101273  Z11933  Hs.182505  POU domain, class 3, transcription facto  21.80
101447  M21305  gb:Human alpha satellite and satellite 3  193.60
101649  AW959908  Hs.1690  heparin-binding growth factor binding pr  38.40
101724  L11690  Hs.620  bullous pemphigoid antigen 1 (230/240kD)  198.80
101748  NM_001944  Hs.1925  desmoglein 3 (pemphigus vulgaris antigen  78.60
101809  M86849  Hs.323733  gap junction protein, beta 2, 26kD (conn  162.20
101879  M176374  Hs.243886  nuclear autoantigenic sperm protein (his  50.00
101915  AF207881  Hs.155185  cytosolic ovarian carcinoma antigen 1  26.00
101973  U41514  Hs.80120  UDP-N-acetyl-alpha-D-galactosamine:polyp  37.20
102025  UW045  Hs.78934  mutS (E. coli) homolog 2 (colon cancer,
102031  U04898  Hs.2156  RAR-related orphan receptor A  32.00
102052  NM_002202  Hs.505  ISL1 transcription factor, LIM/homeodoma  51.20
102391  AA296874  Hs.77494  deoxyguanosine kinase  13.90
102420  U44060  Hs.14427  Homo sapiens cDNA: FLJ21800 fis, clone H  28.80
102610  U65011  Hs.30743  preferentially expressed antigen in mela  110.60
102829  NM_006183  Hs.80962  neurotensin  116.80
103000  NM_001975  Hs.146580  enolase 2, (gamma, neuronal)  2.30
103036  M13509  Hs.83169  matrix metalloproteinase 1 (interstitial  181.40
103507  AJ000512  Hs.296323  serum/glucocorticoid regulated kinase  49.20
103587  BE270266  Hs.82128  5T4 oncofetal trophoblast glycoprotein  86.60
104660  BE298665  Hs.14846  Homo sapiens mRNA; cDNA DKFZp564D016 (fr  42.60
104896  AW015318  Hs.23165  ESTs  29.40
105038  AW503733  Hs.9414  KIAA1488 protein  21.50
105298  BE387790  Hs.26369  hypothetical protein FLJ20237  32.80
105510  Z42047  Hs.283978  Homo sapiens PRO2751 mRNA, complete cds  20.20
105667  AA767526  Hs.22030  paired box gene 5 (B-cell lineage specif  28.40
106073  AL157441  Hs.17834  downstream neighbor of SON  25.40
106205  AW965058  Hs.111583  ESTs, Weakly similar to I38022 hypotheti  32.00
106516  AI137311  Hs.234074  Homo sapiens mRNA; cDNA DKFZp761G02121 (  40.60
106533  AL134708  Hs.145998  ESTs  59.80
106575  AW970602  Hs.105421  ESTs  43.40
106654  AW075485  Hs.286049  phosphoserine aminotransferase  50.80
106851  AI458623  gb:tk04g09.x1 NCI_CGAP_Lu24 Homo sapiens  53.40
106995  AB023139  Hs.37892  KIAA0922 protein  20.88
107332  T87750  Hs.183297  DKFZP566F2124 protein  23.60
107532  AA443473  Hs.173684  Homo sapiens mRNA; cDNA DKFZp762G207 (fr  57.20
107922  BE153855  Hs.61460  Ig superfamily receptor LNIR  49.00
108609  BE409857  Hs.69499  hypothetical protein  19.67
108780  AU076442  Hs.117938  collagen, type XVII, alpha 1  48.17
109166  AA219691  Hs.73625  RAB6 interacting, kinesin-like (rabkines  59.20
109260  AW978515  Hs.131915  KIAA0863 protein  28.60
109280  AK001355  Hs.279610  hypothetical protein FLJ10493  22.80
109292  AW975746  Hs.188662  KIAA1702 protein
109384  AA219172  Hs.86849  ESTs  21.00
109415  U80736  Hs.110826  trinucleotide repeat containing 9  31.60
109445  AA232103  Hs.189915  ESTs  24.20
109502  AW967069  Hs.211556  hypothetical protein MGC6487  21.40
109633  AW003785  Hs.170267  ESTs  20.40
109786  AI989482  Hs.146286  kinesin family member 13A  19.60
109958  AA001266  Hs.133521  ESTs  24.00
110920  N47224  Hs.20521  HMT1 (hnRNP methyltransferase, S. cerevi  28.40
110924  AW058463  Hs.12940  zinc-fingers and homeoboxes 1  36.00
111084  H44186  Hs.15456  PDZ domain containing 1  61.20
111132  AB037807  Hs.83293  hypothetical protein  24.60
111229  AW389845  Hs.110855  ESTs  27.20
111337  AA837396  Hs.263925  LIS1-interacting protein NUDE1, rat homo  48.00
111987  NM_015310  Hs.6763  KIAA0942 protein  37.80
112046  AA383343  Hs.22116  CDC14 (cell division cycle 14, S. cerevi  26.80
112268  W39609  Hs.22003  solute carrier family 6 (neurotransmitte  63.80
112685  R87650  Hs.33439  ESTs, Weakly similar to ALU1_HUMAN ALU  26.40
112871  AL110216  Hs.12285  ESTs, Weakly similar to I55214 salivary  47.64
112897  AW206453  Hs.3782  ESTs  22.00
112973  AB033023  Hs.318127  hypothetical protein FLJ10201  65.00
112992  AL157425  Hs.133315  Homo sapiens mRNA; cDNA DKFZp761J1324 (f  42.00
113073  N39342  Hs.103042  microtubule-associated protein 1B  55.40
113494  T91451  Hs.86538  ESTs  22.80
113560  T91015  Hs.268626  ESTs  22.80
113849  AA457211  Hs.8858  bromodomain adjacent to zinc finger doma  51.80
113950  AI267652  Hs.30504  Homo sapiens mRNA; cDNA DKFZp434E082 (fr  28.20
114339  AA782845  Hs.22790  ESTs  20.20
114365  H42169  Hs.18653  hypothetical protein FLJ14627  21.00
114455  H37908  Hs.271616  ESTs, Weakly similar to ALU8_HUMAN ALU S  25.80
114518  AW163267  Hs.106469  suppressor of var1 (S.cerevisiae) 3-like  23.60
114824  AA950961  Hs.305953  zinc finger protein 83 (HPF1)  27.20
114837  BE244930  Hs.166895  ESTs  30.20
114974  AW966931  Hs.179662  nucleosome assembly protein 1-like 1  20.80
115075  AA814043  Hs.88045  ESTs  30.60
115084  BE383668  Hs.42484  hypothetical protein FLJ10618  28.86
115291  BE545072  Hs.122579  hypothetical protein FLJ10461  38.00
115313  AA808001  Hs.184411  albumin  22.60
115697  D31382  Hs.63325  transmembrane protease, serine 4  173.60
115909  AW872527  Hs.59761  ESTs, Weakly similar to DAP1_HUMAN DEATH  27.77
116090  AI591147  Hs.61232  ESTs  20.80
116107  AL133916  Hs.172572  hypothetical protein FLJ20OB3  164.20
116399  AA889120  Hs.110637  homeo box A10  38.00
117099  H93699  gb:yv16a11.s1 Soares fetal liver spleen  21.60
117881  AF161470  Hs.260622  butyrate-induced transcript 1  49.40
118091  AW005054  Hs.47883  ESTs, Weakly similar to KCC1_HUMAN CALCI  22.40
118138  AA374756  Hs.93560  Homo sapiens mRNA for KIAA1771 protein,  22.00
118720  N73515  gb:za49d07.s1 Soares fetal liver spleen  20.00
118873  AI824009  Hs.44577  ESTs  19.40
119126  R45175  Hs.117183  ESTs  111.20
119717  AA918317  Hs.57987  B-cell CLL/lymphoma 11B (zinc finger pro  33.00
119940  AL050097  Hs.272531  DKFZP586B0319 protein  31.00
120266  AI807264  Hs.205442  ESTs, Weakly similar to T34036 hypotheti  20.20
120515  AA258356  gb:zr59c10.s1 Soares_NhHMPu_S1 Homo sapi  25.00
120859  AAS26434  Hs.1619  achaete-scute complex (Drosophila) homol  95.40
120983  AA398209  Hs.97587  EST  105.20
121054  AW976570  Hs.97387  ESTs  38.80
121369  AW450737  Hs.128791  CGI-09 protein  41.60
122335  AA443258  Hs.241551  chloride channel, calcium activated, fam  30.80
122612  AA974832  Hs.128708  ESTs  19.60
123130  AA487200  gb:ab19f02.s1 Stratagene lung (937210) H  33.20
123440  AI733692  Hs.112488  ESTs  23.17
123596  AA421130  Hs.112640  EST  23.00
123619  AA602964  gb:no97c02.s1 NCI_CGAP_Pr2 Homo sapiens  28.80
124006  AI147155  Hs.270016  ESTs  77.60
124169  BE079334  Hs.271630  ESTs  22.20
124281  AI333756  Hs.111801  arsenate resistance protein ARS2  42.20
124472  N52517  Hs.102670  EST  32.60
124617  AW628168  Hs.152684  ESTs  21.80
124631  NM_014053  Hs.270594  FLVCR protein  30.40
124839  R55784  Hs.140942  ESTs  21.20
125186  AA610620  Hs.181244  major histocompatibility complex, class  42.80
125321  T86652  Hs.178294  ESTs  27.00
125535  NM_013243  Hs.22215  secretogranin III  23.60
125646  AA628962  Hs.75209  protein kinase (cAMP-dependent, catalyti  23.20
125684  AW589427  Hs.158849  Homo sapiens cDNA FLJ21663 fis, clone C  21.20
125724  AL360190  Hs.295978  Homo sapiens mRNA full length insert cDN  48.80
125847  AW161885  Hs.249034  ESTs  31.00
125934  AA193325  Hs.32646  hypothetical protein FLJ21901  21.20
126077  M78772  Hs.210836  ESTs  49.80
126299  AW979155  Hs.298275  amino acid transporter 2  21.80
126395  AI468004  Hs.278956  hypothetical protein FLJ12929  71.00
126433  AA325606  gb:EST28707 Cerebellum II Homo sapiens c  23.20
126509  R47400  Hs.23850  ESTs  23.80
126538  AB030656  Hs.17377  coronin, actin-binding protein, 1C  23.10
126666  AA648886  Hs.151999  ESTs  36.00
126812  AB037860  Hs.173933  nuclear factor I/A  20.80
126872  AW450979  gb:UI-H-BI3^a-a-12-0-UI.s1 NCI_CGAP_Su  46.29
127048  AA321948  Hs.293968  ESTs  22.60
127431  AW771958  Hs.175437  ESTs, Moderately similar to PC4259 fern  30.00
127489  AA650250  Hs.272076  ESTs  20.80
127521  AW297206  Hs.164018  ESTs  25.20
127742  AW293496  Hs.180138  ESTs  28.00
127925  AA805151  Hs.3628  mitogen-activated protein kinase kinase  21.20
127930  AA80SB72  Hs.123304  ESTs  20.54
127968  M83O201  Hs.124347  ESTs  28.20
127987  AI022103  Hs.124511  ESTs  19.60
128116  H07103  Hs.286014  Homo sapiens, clone IMAGE:38S7243, mRNA  20.40
128609  NM_003616  Hs.102456  survival of motor neuron protein interac  34.40
128777  AI678918  Hs.10526  cysteine and glycine-rich protein 2  53.80
128949  AA009647  Hs.8850  a disintegrin and metalloproteinase doma  23.00
129168  AI132988  Hs.109052  chromosome 14 open reading frame 2  37.60
129404  AI267700  Hs.317584  ESTs  28.60
129527  AA769221  Hs.270847  delta-tubulin  40.80
129574  AA026815  Hs.11463  UMP-CMP kinase  31.20
129598  N30436  Hs.11556  Homo sapiens cDNA FLJ12566 fis, clone NT  29.60
129785  H19006  Hs.184780  ESTs  72.20
129970  AV655B06  Hs.296198  chromosome 12 open reading frame 4  22.20
130149  AW0S7805  Hs.172665  methylenetetrahydrofolate dehydrogenase  29.60
130199  Z48579  Hs.172028  a disintegrin and metalloproteinase doma  27.60
130441  US3630  Hs.155637  protein kinase, DNA-activated, catalytic  23.36
130466  W19744  Hs.180059  Homo sapiens cDNA FLJ20653 fis, clone KA  20.20
130482  AW409701  Hs.1578  baculoviral IAP repeat-containing 5 (sur  22.40
130617  M90516  Hs.1674  glutamine-fructose-6-phosphate transamin  19.60
130703  R77775  Hs.18103  ESTs  19.40
130732  AW890487  Hs.63984  cadherin 13, H-cadherin (heart)  21.40
130887  NM_001072  Hs.284239  UDP glycosyltransferase 1 family, polype  110.00
131028  AI879165  Hs.2227  CCAAT/enhancer binding protein (C/EBP),  25.20
131086  AUD35461  Hs.2281  chromogranin B (secretogranin 1)  40.60
131284  NM_001429  Hs.25272  E1A binding protein p300  24.60
131775  AB014548  Hs.31921  KIAA0648 protein  21.00
131860  BE383676  Hs.334  Rho guanine nucleotide exchange factor (  33.40
131945  NM_002916  Hs.35120  replication factor C (activator 1) 4 (37  60.80
132040  NM_001196  Hs.315689  Homo sapiens cDNA: FLJ22373 fis, clone H  20.40
132084  NM_002267  Hs.3885  karyopherin alpha 3 (importin alpha 4)  29.40
132389  AA310393  Hs.190044  ESTs  32.40
132437  AA152106  Hs.4859  cyclin L ania-6a  27.40
132550  AW969253  Hs.170195  bone morphogenetic protein 7 (osteogenic  75.60
132617  AF037335  Hs.5338  carbonic anhydrase XII  31.36
132632  AU076916  Hs.5398  guanine monphosphate synthetase  32.40
132672  W27721  Hs.54697  Cdc42 guanine exchange factor (GEF) 9  23.40
132742  AA025480  Hs.292812  ESTs, Weakly similar to T33468 hypotheti  61.20
132771  Y10275  Hs.56407  phosphoserine phosphatase  22.33
133070  U92649  Hs.64311  a disintegrin and metalloproteinase doma  23.50
133153  AF070592  Hs.66170  HSKM-B protein  30.00
133181  X91662  Hs.66744  twist (Drosophila) homolog (acrocephalos  23.80
133282  AA449015  Hs.286145  SRB7 (suppressor of RNA polymerase B, ye  51.60
133350  AI499220  Hs.71573  hypothetical protein FLJ10074  33.00
133592  AV652066  Hs.75113  general transcription factor IIIA  82.00
133658  AA319146  Hs.75426  secretogranin II (chromogranin C)
133865  AB011155  Hs.170290  discs, large (Drosophila) homolog 5  69.33
134032  NM_005025  Hs.78589  serine (or cysteine) proteinase inhibito  33.20
134125  NM_014781  Hs.50421  KIAA0203 gene product  31.60
134158  U15174  Hs.79428  BCL2/adenovirus E1B 19kD-interacting pro  30.60
134321  BE538082  Hs.8172  ESTs, Moderately similar to A46010 X-lin  23.40
134367  AA339449  Hs.82285  phosphoribosylglycinamide formyltransfer  49.20
134570  U66615  Hs.172280  SWI/SNF related, matrix associated, acti  20.20
134753  NM_006482  Hs.173135  dual-specificity tyrosine-(Y)-phosphoryl  20.80
135002  AA448542  Hs.251677  G antigen 7B  37.60
135029  H58818  Hs.187579  hydroxysteroid (17-beta) dehydrogenase  53.40
135047  AL134197  Hs.93597  cyclin-dependent kinase 5, regulatory su  31.60
135345  X53655  Hs.99171  neurotrophin 3  28.80



TABLE 4B shows the accession numbers for those primekeys lacking unigeneID's for Table 4A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



Pkey  CAT number  Accessions

123619  371681_1  AA602964 AA609200
126433  127143_1  AA325606 AA099517 N89423
126872  142696_1  AW450979 AA136653 AA136656 AW419381 AA984358 AA492073 BE168945 AA809054 AW238038 BE011212 BE011359 BE011367 BE011368 BE011362 BE011215 BE011365 BE011363
106851  322947_1  AI458623 AA639708 AA485409 R22065 AA485570
116720  genbank_N73515  N73515
120515  genbank_AA258356  AA258356
117099  321871_1  H93699 H97976 H80036
101447  entrez_M21305  M21305
123130  genbank_AA487200  AA487200



Table 5A shows 680 genes up-regulated in squamous cell carcinoma or adenocarcinoma lung tumors relative to normal lung and chronically diseased lung. These genes were
selected from 59680 probesets on the Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average
intensity (AI), a normalized value reflecting the relative level of mRNA expression.

Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
R1: 70th percentile of AI for squamous cell carcinoma and adenocarcinoma lung tumor samples divided by the 90th percentile of AI for normal and chronically diseased lung samples.
R2: 80th percentile of AI for adenocarcinoma lung tumor samples divided by the 90th percentile of AI for normal and chronically diseased lung samples.
R3: 80th percentile of AI for squamous cell carcinoma lung tumor samples divided by the 90th percentile of AI for normal and chronically diseased lung samples.
R4: 80th percentile of AI for adenocarcinoma lung tumor samples divided by the 80th percentile of AI for squamous cell carcinoma lung tumor samples.
R5: 70th percentile of AI for squamous cell carcinoma and adenocarcinoma lung tumor samples minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples, divided by the 90th percentile of AI for normal and chronically diseased lung samples minus the 15th percentile of AI for all normal lung, chronically diseased lung and tumor samples.
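By way of illustration only, the five ratios R1 to R5 defined above can be sketched in Python as follows. The names percentile and table5a_ratios and the intensity values are hypothetical. The text does not state which percentile convention was used, so linear interpolation between ranks is assumed here purely for the example.

```python
def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between ranks.

    Assumption: the interpolation rule is chosen for illustration; the
    source does not specify one.
    """
    xs = sorted(values)
    if not xs:
        raise ValueError("no values supplied")
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)                      # index of the lower neighbour
    hi = min(lo + 1, len(xs) - 1)      # index of the upper neighbour
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def table5a_ratios(squamous, adeno, normal_and_diseased):
    """Return R1-R5 for one probeset from per-sample AI values.

    No guard is included for zero denominators; real data would need one.
    """
    tumor = list(squamous) + list(adeno)
    everything = tumor + list(normal_and_diseased)
    p90_normal = percentile(normal_and_diseased, 90)
    p15_all = percentile(everything, 15)   # background term used only in R5
    return {
        "R1": percentile(tumor, 70) / p90_normal,
        "R2": percentile(adeno, 80) / p90_normal,
        "R3": percentile(squamous, 80) / p90_normal,
        "R4": percentile(adeno, 80) / percentile(squamous, 80),
        "R5": (percentile(tumor, 70) - p15_all) / (p90_normal - p15_all),
    }


# Hypothetical AI values for a single probeset (not taken from the tables).
ratios = table5a_ratios(
    squamous=[20.0, 20.0, 20.0],
    adeno=[40.0, 40.0, 40.0],
    normal_and_diseased=[5.0, 5.0, 5.0, 5.0, 10.0, 10.0],
)
```

R5 differs from R1 only in that the 15th percentile over all samples is subtracted from both numerator and denominator, so a probeset with a high background intensity in every sample yields a larger R5 than R1 for the same tumor and normal percentiles.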
Pkey


ExAccn 


UnigenelD 


100035 






100036 






100037 






100071 


A28102 




100114 


X02308 


Hs.82962 


100154 


H60720 


Hs.81892 


100187 


017793 


Hs.78183 


100188 


AW247090 


Hs.57101 


100202 


BE294407 


Hs.99910 


100216 


AA489908 


Hs.1390 


100269 


NM_001949 


Hs.1189 


100287 


AU076657 


,Hs.160O 


100297 


AU077258 


Hs.182429 


100330 


AW410976 


Hs.77152 


100335 


AW247529 


Hs.6793 


100360 


W70171 


Hs.75939 


100372 


NMJH4791 


Hs.184339 


100474 


NMJ00699 


Hs.300280 


100486 


T19006 


Hs.10842 


100491 


056165 


Hs.275163 


100516 


D90278 


Hs.11 


100522 


X51501 


Hs.99949 


100559 


NM.000094 


Hs.1640 


100576 


X00356 


Hs.37058 


100629 


AA015693 


Hs.21291 


100661 


BE623001 


Hs.132748 


100677 


AA353866 


Hs.57813 


100696 


D14887 


Hs.1 21686 


100709 


N26539 


Hs.1 00469 


100761 


BE208491 


Hs.295112 


100830 


AC004770 


Hs.4756 


100867 


U14622 




100902 


M16029 


Hs.287270 


100906 


AU076916 


Hs.5398 


100960 


J00124 


Hs.117729 


101045 


J05614 




101061 


NM.000175 


Hs.180532 


101071 


L02840 


Hs.84244 


101124 


L10343 


Hs.112341 


101175 


U82671 


Hs.36980 


101181 


BE262621 


Hs;73798 


101204 


L24203 


Hs.82237 


101210 


L29301 


Hs.2353 


101216 


AA284166 


Hs.84113 


101228 


AA333387 


Ms.82916 


101233 


AL135173 


Hs.878 


101273 


Z11933 


Hs.182505 


101342 


U52112 


Hs.182018 


101346 


AI738616 


Hs.77348 


101369 


NM.000892 


Hs.1901 


101396 


BE267931 


Hs.78996 


101431 


BE185289 


Hs.1076 


101448 


NWL000424 


Hs.195850 


101462 


AL035668 


Hs.73853 


101466 


8E262660 


Hs.170197 


101484 


AA053486 


Hs.20315 


101502 


M26958 




101505 


AA307680 


Hs.75692 


101526 


NR.002197 


Hs.154721 


101535 


X57152 


Hs.99853 


101577 


M34353 


Hs.1041 


101649 


AW959908 


Hs.1690 


101663 


NM.003528 


Hs.2178 


101664 


AA436989 


Hs.121017 


101669 


L24498 


Hs.80409 



Unigene Title 

AFFX control: GAPDH 
AFFX control: GAPDH 
AFFX control: GAPDH 
Human GABAa receptor alpha-3 subunit 
thymidylate synthetase 
WAA0101 gene product 
aldo-keto reductase family 1 , member C3 
minlchromosome maintenance deficient (S. 
phosphofructokinase, platelet 
proteasome (prosome, macropain) subunit, 
E2F transcription factor 3 
chaperonin containing TCP1, subunit 5 (e 
protein disulfide isomerase-related prot 
minichromosome maintenance deficient (S. 
platelet-acfivating factor acetylhydrola 
uridine monophosphate kinase 
KIAA0175 gene product 
amylase, alpha 2A; pancreatic . 
RAN, member RAS oncogene family 
non-metastatic cells 2, protein (NM23B) 
cardnoembryonic antigen-related cell ad 
prdactin-induced protein 
collagen, type VII, alpha 1 (epidermolys 
calcitonln/calcitonin-related polypeptid 
mitogen-activated protein kinase kinase 
Homo sapiens ribosomal protein L39 mRNA, 
zinc ribbon domain containing, 1 
general transcription factor HA, 1 (37k 
myeloid/lymphoid or mixed-lineage leukem 
WAA0618 gene product 
flap structure-specific endonuclease 1 
gb:Human transketolase-like protein gene 
ret proto-oncogene (multiple endocrine n 
guanine monphosphate synthetase 
keratin 14 (epidermolysis bullosa simple 
gb;Human proliferating ceD nuclear anQ 
glucose phosphate Isomerase 
potassium voltage-gated channel, Shah-re 
protease inhibitor 3, skin-derived (SKAL 
melanoma antigen, family A, 2 
macrophage migration inhibitory factor ( 
ataxia-telangiectasia group D-associated 
opioid receptor, mu 1 
cyclin-dependent kinase inhibitor 3 (CDK 
chaperonin containing TCP1, subunit 6A ( 
sorbitol dehydrogenase 
POU domain, class 3, transcription facto 
interteukin-1 receptor-associated kinase 
hydroxyprostaglandin dehydrogenase 15-(N 
kalfikrein B, plasma (Fletcher factor) 1 
proliferating cell nuclear antigen 
small proGne-rich protein 1B (comifin) 
keratin 5 (epidermolysis bullosa simplex 
bone morphogenetic protein 2 
glutamic-oxaloacetic transaminase 2, mil 
interferon-induced protein with tetratri 
gb:Human parathyroid hormone-related pro • 
asparagine synthetase 
acorutasel, soluble 
fibrillarin 

v-ros avian UR2 sarcoma virus oncogene h 
heparin-bindlng growth factor binding pr 
H2B histone family, member Q 
H2A histone family, member A 
growth arrest and DNA-damage-tnducibte, 



R1 R2 R3 



R4 



8.00 



184 

3.33 



2.55 



5.07 



3.10 
3.85 



Z57 

3.12 
3.50 

4.08 

2.53 

8.50 

3.24 
8.31 

10.50 
4.02 



54.00 

5.59 

7.00 



7.20 



3.60 



7.60 

10.20 
8.00 



12.91 



24.80 



6.40 



15.65 



14.20 

9.30 
20.60 



10.00 



R5 

6.76 
5.77 
5.75 

5.71 



4.52 
5.49 
5.67 

5.66 
3.81 
4.60 

4.82 
3.79 

5.49 
4.17 



5.16 

4.69 
4.19 



21.89 
12.80 



38.80 
12.00 

9.09 



7.90 
4.45 

4.17 



7.90 

4.01 

4.46 
4.65 



7.60 
101695


M69136 


Hs.1 35626 


101724 


L11690 


Hs.620 


101748 


NM.001944 


Hs.1925 


101759 


M80244 


Hs.184601 


101771 


NM.002432 


Hs.153837 


101804 


M86699 


Hs.169840 


101809 


M86S49 


Hs.323733 


101833 


AU076442 


Hs.1 17938 


101842 


M93221 


Hs.75182 


101B51 


BE260964 


Hs.82045 


102002 


NM.002484 


Hs.81469 


102039 


AL134223 


Hs.306098 


102072 


U09410 


Hs.78743 


102083 


T35901 


Hs.75117 


102111 


L36198 


Hs.81884 


102123 


NNL001809 


Hs.1594 


102154 


U17760 


Hs.75S17 


102193 


AL036335 


Hs.313 


102217 


AA829978 


Hs.301613 


102224 


NMJKJ2810 


Hs.148495 


102234 


AW163390 


Hs.278554 


102251 


NMJKM398 


Hs.41706 


102305 


AL043202 


Hs.90073 


102330 


BE298063 


Hs.77254 


102340 


U37055 


Hs.278657 


102348 


U37519 


Hs.87539 


102368 


U39817 


Hs.36820 


102394 


NMJI0381& 


Hs.2442 


102404 


NMJJ05429 


Hs.79141 


102537 


U57094 


Hs.50477 


102581 


AU077228 


Hs.77256 


102605 


A1435128 


Hs.181369 


102610 


U65011 


Hs.30743 


102623 


AW249285 


Hs.37110 


102642 


AA205847 


Hs.23016 


102654 


AV649989 


Hs.24385 


.102659 


BE245169 


Hs.211610 


102669 


U71207 


Hs.29279 


102672 


U72066 


Hs.29287 


102687 


NMJI07019 


Hs.93002 


102696 


BE540274 


Hs.239 


102768 


U82321 




102781 


BE258778 


Hs.1 08809 


102784 


U85658 


Hs.61796 


102824 


U90916 


Hs.82845 


102829 


NM.006183 


Hs.80962 


102888 


AI346201 


Hs.76118 


102B92 


BE440042 


Hs.83326 


102913 


NMJ02275 


Hs.80342 


102935 


. BE561850 


Hs.B0506 


102951 


X15218 


Hs.2969 


102983 


BE387202 


Hs.1 18638 


.103023 


AW500470 


Hs.1 17950 


103036 


M13509 


Hs.83169 


103038 


AA926960 


Hs.334883 


103060 


NMJJ05940 


Hs.155324 


103099 


AI693251 


Hs.8248 


103119 


X63629 


Hs.2877 


103168 


X53463 


Hs.2704 


103185 


NNLQ06825 


Hs.74368 


103192 


M22440 


Hs.170009 


103223 


BE275607 


Hs.1708 


103242 


X78342 


Hs.389 


103316 


X83301 


Hs.324728 


103375 


NMJW5982 


Hs.54416 


103376 


AL036166 


Hs.323378 


103385 


NM.007069 


Hs.37189 


103391 


X94453 


Hs.1 14366 


103404 


BE394784 


Hs.78596 


103430 


BE564090 


Hs.20716 


103446 


X98834 


Hs.79971 


103476 


Y07701 


Hs.293007 


103477 


AJ011812 


Hs.1 19018 


103478 


BE514982 


Hs.38991 


103515 


Y10275 


Hs.56407 


103558 


BE616547 


Hs.2785 


103580 


AA328046 


Hs.46405 


103587 


BE270266 


Hs.82128 


103594 


A1368680 


Hs.816 


103636 


NM.006235 


Hs.2407 


103768 


AF086009 




103841 


AA314821 


Hs.38178 


103847 


AF219946 


Hs.102237 


103913 


AW967500 


Hs.1 33543 


104094 


AA418187 


Hs.330515 



chymase 1, mast cell 
bullous pemphigoid antigen 1 (2307240kD) 
desmogieln 3 (pemphigus vulgaris antigen 
solute earner family 7 (cat'onic amino 
myeloid cell nuclear differentiation ant 
TTK protein kinase 

gap junction protein, beta 2, 26kD (conn 
collagen, type Xvli, alpha 1 
mannose receptor, C type 1 
midWne (neurite growth-promoting factor 
nucleotide binding protein 1 (Exoli Min 
aido-keto reductase family 1, member CI 
zinc finger protein 131 (clone pHZ-10) 
interieukin enhancer binding 'factor 2, 4 
sulfotransferase famHy, cytosolrc, 2A, 
centromere protein A (17kD) 
laminin, beta 3 (nicein (125kD), kaiinin 
secreted phosphoprotein 1 (osleopontin, 
JTV1 gene 

proteasome (prosome, macropain) 26S subu 
heterochromatin-like protein 1
DEAD/H (Asp-Glu-Ala-Asp/His) box polypep
chromosome segregation 1 (yeast homolog)
chromobox homolog 1 (Drosophila HP1 beta
macrophage stimulating 1 (hepatocyte gro
aldehyde dehydrogenase 3 family, member
Bloom syndrome

a disintegrin and metalloproteinase doma
vascular endothelial growth factor C
RAB27A, member RAS oncogene family
enhancer of zeste (Drosophila) homolog 2
ubiquitin fusion degradation 1-like
preferentially expressed antigen in mela
melanoma antigen, family A, 9
G protein-coupled receptor
Human hbc647 mRNA sequence
CUG triplet repeat RNA-binding protein
eyes absent (Drosophila) homolog 2
retinoblastoma-binding protein 8
ubiquitin carrier protein E2-C
forkhead box M1

gb:Homo sapiens clone 14.9B mRNA sequenc 
chaperonin containing TCP1, subunit 7 (e
transcription factor AP-2 gamma (activat
Homo sapiens cDNA: FLJ21930 fis, clone H
neurotensin

ubiquitin carboxyl-terminal esterase L1
matrix metalloproteinase 3 (stromelysin
keratin 15

small nuclear ribonucleoprotein polypept
v-ski avian sarcoma viral oncogene homol
non-metastatic cells 1, protein (NM23A)
multifunctional polypeptide similar to S
matrix metalloproteinase 1 (interstitial
CDC28 protein kinase 1
matrix metalloproteinase 11 (stromelysin
NADH dehydrogenase (ubiquinone) Fe-S pro
cadherin 3, type 1, P-cadherin (placenta
glutathione peroxidase 2 (gastrointestin
transmembrane protein (63kD), endoplasmi
transforming growth factor, alpha
chaperonin containing TCP1, subunit 3 (g
alcohol dehydrogenase 7 (class IV), mu o
SMA5

sine oculis homeobox (Drosophila) homolo
coated vesicle membrane protein
similar to rat HREV107
pyrroline-5-carboxylate synthetase (glut
proteasome (prosome, macropain) subunit,
translocase of inner mitochondrial membr
sal (Drosophila)-like 2
aminopeptidase puromycin sensitive
transcription factor NRF
S100 calcium-binding protein A2
phosphoserine phosphatase
keratin 17

polymerase (RNA) II (DNA directed) polyp
5T4 oncofetal trophoblast glycoprotein
SRY (sex determining region Y)-box 2
POU domain, class 2, associating factor
gb:Homo sapiens full length insert cDNA
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hypothetical protein DKFZp566N034 
estrogen receptor binding site associate 
RNA polymerase I subunit 
cdk inhibitor p21 binding protein
heme-regulated initiation factor 2-alpha
hypothetical protein MGC4816
nuclear receptor subfamily 1, group I, m
Homo sapiens cDNA: FLJ21933 fis, clone H
Homo sapiens mRNA; cDNA DKFZp564O016 (fr
ESTs, Highly similar to S60712 band-6-pr
cAMP responsive element modulator
NPO002 protein
hypothetical protein FLJ12691
mitotic spindle coiled-coil related prot
chromosome 20 open reading frame 1
hypothetical protein FLJ12934
hypothetical protein MGC14833
HBV associated factor
ESTs, Weakly similar to 138022 hypotheti
hypothetical protein NUF2R
ERO1 (S. cerevisiae)-like
cytoskeleton associated protein 2
gb:zr57e08.s1 Soares_NhHMPu_S1 Homo sapi
hypothetical protein FLJ20287
DiGeorge syndrome critical region gene 8
Homo sapiens, clone IMAGE:4179986, mRNA,
paired box gene 5 (B-cell lineage specif
sema domain, immunoglobulin domain (Ig),
B-cell CLL/lymphoma 11B (zinc finger pro
ESTs 

heat shock 90kD protein 1, alpha
McKusick-Kaufman syndrome
ESTs, Weakly similar to G02075 transcrip
downstream neighbor of SON
hypothetical protein FLJ13352
hypothetical protein FLJ10439
mitochondrial ribosomal protein L36
ESTs, Weakly similar to ALU1_HUMAN ALU S
high-mobility group (nonhistone chromoso
ESTs, Weakly similar to putative p150 [
cleavage and polyadenylation specific fa
hypothetical protein, estradiol-induced
glutamate-cysteine ligase, catalytic sub
tyrosylprotein sulfotransferase 1 
ESTs 

Homo sapiens mRNA; cDNA DKFZp564B076 (fr
phosphoserine aminotransferase
deleted in lymphocytic leukemia, 1
CG-07 protein

hypothetical protein FLJ11269

M-phase phosphoprotein 9

ESTs, Weakly similar to ALU5_HUMAN ALU S

KIAA1272 protein

RAD51 (S. cerevisiae) homolog (E coli Re
ESTs

nucleolar protein 1 (120kD)

flap structure-specific endonuclease 1

KIAA1040 protein

programmed cell death 2

DKFZP586E1621 protein

solute carrier family 6 (neurotransmitte

Homo sapiens clone 24416 mRNA sequence

fibrillarin

nucleolar protein (KKE/D repeat) 

Homo sapiens, clone IMAGE:3603836, mRNA, 

EST 

keratin 6B 

ig superfamily receptor LNIR 
hypothetical protein FLJ21620
protein kinase NYD-SP15
ESTs

hypothetical protein FLJ12572
hypothetical protein FLJ11210
ESTs

gb:zm61e06.r1 Stratagene fibroblast (937
gb:zm86a08.s1 Stratagene ovarian cancer
hypothetical protein DKFZp434l0428
gb:zn13b09.s1 Stratagene hNT neuron (937
gb:zi84c04.s1 Stratagene colon (937204)
Homo sapiens cDNA FLJ11448 fis, clone HE
hypothetical protein FLJ20285
KIAA1077 protein
ESTs

ESTs, Moderately similar to 2109260A B c
collagen, type XVII, alpha 1 
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108610 


AW295647 


Hs.71331 


hypothetical protein MGC5350 


8.50


108816 


AA130884


Hs.270501 


ESTs, Moderately similar to ALU2_HUMAN 




108857 


AK001468 


Hs.62180 


anillin (Drosophila Scraps homolog), act


4.00 


108860 


AA133334 


Hs.129911 


ESTs 


6.09 


108937 


AL050107 


Hs.24341 


transcriptional co-activator with PDZ-N


3.00 


109010 


NM_007240


Hs.44229 


dual specificity phosphatase 12 


2.69 


109121 


BE389387 


Hs.49767 


NADH dehydrogenase (ubiquinone) Fe-S pro




109166 


AA219691 


Hs.73625 


RAB6 interacting, kinesin-like (rabkines


10.58 


109227 


AA766998 


Hs.85874 


Human DNA sequence from clone RP11-16L21




109415 


U80736 


Hs.110826


trinucleotide repeat containing 9 




109418 


AI866946 


Hs.161707


ESTs 




109454 


AA232255 


Hs.295232 


ESTs, Moderately similar to A46010 X-0 




109502 


AW967069 


Hs.211556 


hypothetical protein MGC5487 




109543 


AA564994 


Hs.222851 


ESTs 




109648 


H17800 


Hs.7154 


ESTs 




109680 


AB037734 


Hs.4993 


KIAA1313 protein




109700 


F09609




gb:HSC33H092 normalized infant brain cDN 




109704 


AI743880 


Hs.12876


ESTs 




109792 


R49625 




gb:yg61f03.s1 Soares infant brain 1NIB H 




109981 


BE546208 


Hs.26090 


hypothetical protein FLJ20272


4.00 


109998 


AL042201 


Hs.21273 


transcription factor NYD-sp10 




110039 


H11938


Hs.21907 


histone acetyltransferase 




110156 


AA581322 


Hs.4213 


hypothetical protein MGC16207 




110500 


M907723 


Hs.36962 


ESTs 


4.50 


110551 


AW450381 


Hs.14529


ESTs 




110561 


AA379597 




HSPC150 protein similar to ubiquitin-con 


3.06 


110854 


BE612992 


Hs.27931 


hypothetical protein FLJ10607 similar to




110886 


AW274992 


Hs.72249 


three-PDZ containing protein similar to 




110916 


BE178102 


Hs.24349 


ESTs 




111003 


N52980 


Hs.83765 


dihydrofolate reductase 




111337 


AAB37398 


Hs.263925 


LIS1-interacting protein NUDE1, rat homo


2.54 


111434 


R01608 


Hs.142736


ESTs 




111439 


AI476429 


Hs.19238


ESTs 




111540 


U82670 


Hs.9786 


zinc finger protein 275 




111597 


R11499 


Hs.189716


ESTs 




111895 


T80581 


Hs.12723 


Homo sapiens clone 25153 mRNA sequence 




111929 


AF027208 


Hs.112360


prominin (mouse)-like 1




112054 


R43590 




gb:yc85g02.s1 Soares infant brain 1NIB H




112210 


R49645 


Hs.7004 


ESTs 




112244 


AB029000 


Hs.70823 


KIAA1077 protein 


2.99 


112382 


R59904 




gb.7hD7g12s1 Soares infant brain 1NIB H 




112392 


R60763 


Hs.193274


ESTs, Moderately similar to I57588 HSre) 




112442 


AA280174 


Hs.285681 


Williams-Beuren syndrome chromosome regi


3.00 


112539 


R70318 


Hs.339730 


ESTs 




112772 


AI992283


Hs.35437 


ESTs, Moderately similar to 138026 MUM 6 




112869 


BE261750 


Hs.4747 


dyskeratosis congenita 1, dyskerin




112935 


R71449 


Hs.268760 


ESTs 


2.73 


112970 


AA694010 


Hs.6932 


Homo sapiens clone 23809 mRNA sequence




112973 


AB033023 


Hs.318127 


hypothetical protein FLJ10201 


11.50 


112992 


AL157425


Hs.133315


Homo sapiens mRNA; cDNA DKFZp761J1324 (f 




113063 


W15573 


Hs.5027 


ESTs, Weakly similar to A47582 B-cell gr


15.00 


113073 


N39342 


Hs.103042 


microtubule-associated protein 1B




113078 


T40444 


Hs.118354


CAT56 protein 




113238 


R45467 


Hs.189813 


ESTs 




113591 


T91881 


Hs.200597 


KIAA0563 gene product 




113702 


T97307 




gb:ye53h05.s1 Soares fetal liver spleen 


25.00 


113844 


AI369275 


Hs.243010 


Homo sapiens cDNA FLJ14445 fis, clone HE




113984 


R96696 


Hs.35598 


ESTs 




114073 


R44953 


Hs.22908 


Homo sapiens mRNA; cDNA DKFZp434J1027 (f
pyruvate dehydrogenase phosphatase 




114162 


AF155661 


Hs.22265 


3.42 


114208 


AL049466 


HsJ859 


ESTs 




114251 


H15261 


Hs.21948 


ESTs 




114285


R44338 


Hs.22974 


ESTs 




114313 


H18456 


Hs.27946 


ESTs 




114339 


AA782845 


Hs.22790 


ESTs 




114407 


BE539976 


Hs.103305


Homo sapiens mRNA; cDNA DKFZp434B0425 (f 




114560 


AI452469 


Hs.165221


ESTs 




114699 


AA127386




gb:zn90d09.r1 Stratagene lung carcinoma




114767 


AI859865 


Hs.154443


minichromosome maintenance deficient (S 


3.21 


114793 


AA158245 




gb:zo76c03.s1 Stratagene pancreas (93720 




114833 


AI417215 


Hs.87159 


hypothetical protein FLJ12577




115047 


BE270930 


Hs.82916 


chaperonin containing TCP1, subunit 6A ( 




115060 


AF052693 


Hs.198249


gap junction protein, beta 5 (connexin 3 




115097 


AA256213 


Hs.72010 


ESTs 




115113 


AA256460 




gb:zr81a04.s1 Soares_NhHMPu_S1 Homo sapi




115123 


AA256641 


Hs.236894 


ESTs, Highly similar to S02392 alpha-2-m 




115134 


AW958073 


Hs.194331 


ESTs, Highly similar to A55713 inositol 




115291 


BE545072 


Hs.122579


hypothetical protein FLJ10461


25.00 


115347 


AA356792 


Hs.334824 


hypothetical protein FLJ14825




115414 


AA662240 


Hs.283099 


AF15q14 protein 
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Hs.33^93 


c-Myc target JP01 
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A1142336 


Hs.43977 


Human DNA sequence from clone RP11-196N1 




115645 


AI207410


Hs.69280 


Homo sapiens, clone IMAGE:3636299, mRNA, 


4.17 
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AW016811 


Hs.234478 


Homo sapiens cDNA: FLJ22648 fis, clone H
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115652 


BE093589 


Hs.38178 


115697 


D31382 


Hs.63325 


115793 


AA424883 


Hs.70333 


115816 


BE042915 


Hs.287588 


115892 


AA291377 


Hs.50831 


115906 


AI767756 


Hs.82302 


115909 


AW872527 


Hs.59761 


115965 


AA001732 


Hs.173233 


115978 


AL035864 


Hs.69517


115985 


AA447709 


Hs.268115 


116090 


AI591147


Hs.61232 


116096 


AA682382 


Hs.59982 


116127 


AF126743 


Hs.279884 


116157 


BE439838 


Hs.44298 


116190 


AI949095 


Hs.67776 


116278 


NM_003686 


Hs.47504 


116335 


AK001100 


Hs.41690 


116496 


AW450694 


Hs.21433


116503 


AI925316 


Hs.212617 


116674 


AT768015 


Hs.92127 


116929 


AA586922 


Hs.80475 


116973 


AI702054 


Hs.166982 


116993 


AI417023


Hs.40478 


117079 


H92325 




117317 


AI263517 


Hs.43322 


117326 


N23629 


Hs.241420 


117396 


W20128 


Hs.296039 


117412 


N32536 


Hs.42645 


117519 


N32528 


Hs.146286 


117693 


AW179019 


Hs.112110 


117721 


N46100


Hs.93939 


117881 


AF161470 


Hs.260622 


117903 


AA768283 


Hs.47111 


117992 


AI015709 


Hs.172089 


118013 


AI674126 


Hs.94031 


118017 


AI813444 


Hs.42197 


118186 


N22886 


Hs.42380 


118325 


AI858065 


Hs.166184 


118367 


N64269 


Hs.48946 


118368 


N64339 


Hs.48956 


118472 


AL157545 


Hs.42179 


118709 


AA232970 


Hs.293774 


119025 


BE003760 


Hs.55209 


119027 


AF086161 


Hs.114611 


119052 


R10889 




119164 


AF221993 


Hs.46743 


119186 


AI979147 


Hs.101265 


119243 


T12603 




119490 


AA195276 


Hs.263858 


119499 


AI918906 


Hs.55080 


119599 


W45552 




119780 


NM_016625


Hs.191381 


119845 


W79123 


Hs.58561 


119941 


AA699485 


Hs.58896 


119994 


AA642402 


Hs.59142 
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W67353 


Hs.170218 


120104 


AK000123 


Hs.180479


120294 


AK000059 


Hs.153881 


120486 


AW368377 


Hs.137569 


120599 


AA804448 


Hs.104463 


120699 


AI683243 


Hs.97258 


120715 


AA292700 




120821 


Y19062 


Hs.96870 


120859 


AA826434 


Hs.1619 


120880 


AA360240 


Hs.97019 


120983 


AA398209 


Hs.97587


121034 


AL389951 


Hs.271623 


121121 


AA399371 


Hs.189095 


121313 


AA402713 


Hs.97872 


121369 


AW450737 


Hs.128791 


121376 


AA448103 


Hs.187958 


121476 


AA412311 


Hs.97903 


121509 


AA868939 


Hs.97888 


121553 


AA412488 


Hs.48820 


121753 


AK000552 


Hs.323518 


121838 


AA425680 


Hs.98441


121857 


BE387162 


Hs.280858 


121991 


AA430058 


Hs.98649 


122089 


AW016543 


Hs.98682 


122105 


AW241685 


Hs.98899 


122163 


AA435702 


Hs.98829 


122318 


AA429743 




122335 


AA443258 


Hs.241551 


122338 


AA443311 


Hs.98998 


122414 


AB13473 


Hs.99087 



hypothetical protein FLJ23468

transmembrane protease, serine 4

hypothetical protein MGC10753

Homo sapiens cDNA FLJ13675 fis, clone PL

ESTs

Homo sapiens cDNA FLJ14814 fis, clone NTT

ESTs, Weakly similar to DAP1_HUMAN DEATH

hypothetical protein FLJ10970

cDNA for differentially expressed CO16 g

ESTs, Weakly similar to T08599 probable
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AW834050 


Hs.9973 


tensin 




132906 


BE613337 


Hs.234896 


geminin 


3.09 


132959 


AW014195 


Hs.61472 


ESTs, Weakly similar to YAE6_YEAST HYPOT 




132962 


AA576635 


Hs.6153 


CGI-48 protein 


3.50 


132990 


X77343 


Hs.334334 


transcription factor AP-2 alpha (activat 


6.18 


132994 


AA112748


Hs.279905 


clone HQ0310 PRO0310p1 


3.19 


133000 


AL042444 


Hs.62402 


p21/Cdc42/Rac1 -activated kinase 1 (yeast 


2.96 


133050 


X73424 


Hs.63788 


propionyl Coenzyme A carboxylase, beta p 


2.55 


133083 


BE244588


Hs.6456 


chaperonin containing TCP1, subunit 2 (b 




133086 


L17131 


Hs.139800 


high-mobility group (nonhistone chromoso 




133134 


AF198620


Hs.65648 


RNA binding motif protein 8A 




133155 


M56583 


Hs.662 


cerebellin 1 precursor




133181 


X91662 


Hs.66744


twist (Drosophila) homolog (acrocephalos


3.00 


133204 


BE267696 


Hs.254105


enolase 1, (alpha)




133412 


U41493 


Hs.73112 


guanine nucleotide binding protein (G pr 




133421 


AF134160 


Hs.7327 


claudin 1 


2.65 


133451 


AW970026 


Hs.73818 


ubiquinol-cytochrome c reductase hinge p 




133453 


AI659306 


Hs.73826 


protein tyrosine phosphatase, non-recept 




133504 


NM_004415


Hs.74316


desmoplakin (DPI, DPII)


6.14


133506 


BE562958 


Hs.74346 


hypothetical protein MGC14353 




133615 


M62843


Hs.75236


ELAV (embryonic lethal, abnormal vision,




133627 


NM_002047


Hs.75280 


glycyl-tRNA synthetase 




133549 


U25849 


Hs.75393 


acid phosphatase 1, soluble 




133669


NM_006925


Hs.166975


splicing factor, arginine/serine-rich 5




133749 


L20852 


Hs.10018


solute carrier family 20 (phosphate tran 




133776 


BE268649 


Hs.177766


ADP-ribosyltransferase (NAD+; poly (ADP-




133865 


AB011155 


Hs.170290


discs, large (Drosophila) homolog 5


3.07 


133946 


AJ001258 


Hs.173878


NIPSNAP, C. elegans, homolog 1 




133973 


N55540 


Hs.78026 


ESTs, Weakly similar to similar to ankyr 




134047 


BE262529 


Hs.78771 


phosphoglycerate kinase 1 




134098 


BE513171 


Hs.79086 


mitochondrial ribosomal protein L3


2.56 


134107 


NM_005629


Hs.187958


solute carrier family 6 (neurotransmitte 




134112 


AW449809 


Hs.79150 


chaperonin containing TCP1, subunit 4 (d 




134158 


U15174 


Hs.79428 


BCL2/adenovirus E1B 19kD-interacting pro


31.00 


134160 


T98152 


Hs.79432 


fibrillin 2 (congenital contractural ara




134168 


AA398908 


Hs.181634 


Homo sapiens cDNA: FLJ23602 fis, clone L




134185 


AA285136 


Hs.301914 


neuronal specific transcription factor D 




134201 


L35035 


Hs.79886 


ribose 5-phosphate isomerase A (ribose 5 




134272 


X76040 


Hs.278614 


protease, serine, 15 


4.50 


134276 


BE083936 


Hs.80976 


antigen identified by monoclonal antibod




134353 


AL138201 


Hs.82120 


nuclear receptor subfamily 4, group A, m 




134367 


AA339449


Hs.82285


phosphoribosylglycinamide formyltransfer


2.80


134380 


AU077143 


Hs.179565 


minichromosome maintenance deficient (S.


4.68


134423 


H53497 


Hs.83008


CGI-139 protein




134469 


AA279661 


Hs.83753 


small nuclear ribonucleoprotein polypept




134470 


X54942 


Hs.83758 


CDC28 protein kinase 2 




134498 


AWZ46273 


Hs.84131 


threonyl-tRNA synthetase




134502 


DC 1*4000** 


ns.oH i oo 






134510 


NM_002757


Hs.250870 


mitogen-activated protein kinase kinase 




134548 


N95406 


Hs.333495 


Deleted in split-hand/split-foot 1 regio


6.00 


134654 


AK001741 


Hs.8739 


hypothetical protein FLJ10879



8.40 



9.80 



8.40 



3.82 



12.25 



7.00 



8.20 



8.60 
27.40 



6.60 



13.20 



9.20 
19.80 



15.83 



9.48 
12.00 



10.80 



12.50 



6.80 



6.11 

8.20 
24.60 



17.80 
14.00 

13.00 



8.40 
9.00 



14.74 



16.40 



4.36 



5.83 



3.87 



4.00 
8.96 
4.28 



4.63 
4.66 

4.55 

4.85 
6.34 

4.91 
4.60 
3.85 

4.08 

6.71 



13.60 



9.70 



3.84 
5.81 
4.21 
7.30 



4.63 
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134724 


AF045239 


Hs.321576 


ring finger protein 22




134743 


AA044163 


Hs.89463 


potassium large conductance calcium-acti


4.00 


134781 


AA374372 


Hs.89626 


parathyroid hormone-like hormone




134806 


AO001528 


Hs.89718 


spermine synthase 




134853 


BE268326 


Hs.90280 


5-aminoimidazole-4-carboxamide ribonucle




134859 


D26488


Hs.90315


KIAA0007 protein




134891 


R51083 


Hs.90787


ESTs 




134960 


BE246400 


Hs.285176 


acetyl-Coenzyme A transporter 


4.00 


134993 


BE409809 


Hs.301005 


purine-rich element binding protein B 




135047 


AL134197 


Hs.93597 


cyclin-dependent kinase 5, regulatory su


9.50 


135080 


AI761180 


Hs.94211 


rcd1 (required for cell differentiation,


5.00 


135103 


NM_003428


Hs.9450 


zinc finger protein 84 (HPF2) 




135145 


AW014729 


Hs.95262 


nuclear factor related to kappa B bindin




135184 


U13222 


Hs.96028 


forkhead box D1 




135242 


AI583187 


Hs.9700 


cyclin E1


13.50 


135286 


AW023482 


Hs.97849 


ESTs 


6.46 


103409 


AW 61 iooy 


Ue Q7fiA 

nS.S/oo 


nypoineucai protein muv iuyz*» simuar 10 




135355 


AK001652 


Hs.99423 


ATP-dependent RNA helicase 


10.00 


135371 


NM_006025


Hs.997 


protease, serine, 22 


&Q0 


135393 


L11244 


Hs.99886 


complement component 4-binding protein, 





12.00 



11.00 



8.80



25.20 



6.20 
7.40 



7.00 






4.58 
4.79 



4.48 



4.01 



14.60 



Table 5B shows the accession numbers for those primekeys lacking UnigeneIDs for Table 5A. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the 
"Accession" column. 

Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 
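
The three-column layout described in this legend can be read programmatically. The following is a minimal sketch only, assuming each row is whitespace-separated in the order Pkey, CAT number, accessions; the names `ClusterRow` and `parse_cluster_rows` are hypothetical and are not part of the original disclosure. The two sample rows are taken from the listing that follows.

```python
from typing import Dict, Iterable, List, NamedTuple


class ClusterRow(NamedTuple):
    """One row of a Pkey / CAT number / Accession listing (hypothetical type)."""
    pkey: str              # unique Eos probeset identifier
    cat_number: str        # gene cluster number
    accessions: List[str]  # Genbank accessions making up the cluster


def parse_cluster_rows(lines: Iterable[str]) -> Dict[str, ClusterRow]:
    """Index rows by Pkey; assumes whitespace-separated fields per row."""
    rows: Dict[str, ClusterRow] = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            # Skip blank or incomplete lines rather than guess at their content.
            continue
        pkey, cat_number, *accessions = fields
        rows[pkey] = ClusterRow(pkey, cat_number, accessions)
    return rows


rows = parse_cluster_rows([
    "119599 genbank_W45552 W45552",
    "112382 genbank_R59904 R59904",
])
```

A probeset that lacks a UnigeneID can then be looked up by its Pkey, for example `rows["112382"].accessions`.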



Pkey 

117079 
124305 
101502 
109792 
126034 
102768 
126345 
127066 
127099 
119243 
125875 
112054 
126979 
126992 
122318 



CAT number Accessions 



114793 
108305 
108393 
100867 
123731 
109700 
120715 
113702 
115113 
101045 
108554 
108573 
119052 
126522 
126605 
103766 



1621717J 
242183J 
18202.-6 
754958.1 
1598157J 
44641 1 
1653833J 
1703458.1 
244301.1 
1774795.1 
1566433.1 
1538292.1 
171411.1 
880S55J 
292419.1 
135322J 
15074^1 
111550.1 
113411.1 
tigr.HT4586 
genbank_AA609839 
genbank_F09669 F09609 
genbank_AA292700 
genbank_T97307 T97307 
genbank_AA256460 
entrez_J05614 J05614 
genbank_AA084948 
genbank_AA086005 
149538.1 
416020.1 



H92325 T97125 

AW963221 AA344870 AA344871 H93331 
M26958 

R49625 F10674 
H60340 N91637 
U82321 H66077 
N49713 N49819 W03810 
R25066 R20144 R20145 Z43B45 
AA347668 AW956810 Z44271 F07065 F07064 R13506 
T12603 T12604 
H14480 N98295 
R43590 F10439 
AA210954 AA211007 
AI809521 H12174 Z42556 
AA429743 AA442754 
AA127386 R15644 AA127404 
AA158245 AA158235 
AA071391 AA069892 AA069891 
AA075211 AA075245 AA075126 AA074946 
U14622 

AA609839 



AA292700 
AA256460 



AA084948 
AA086005 
R10889 R10888 
W31912 AI167491 
439280.1 AA676910AA778853AA778865 W86800 
46922.1 W42667 AI580740 AI690440 AI561350 AW467906 AW1 51450 AI825927 AL041 716 AI8856Q0 AI74221 3 AW248624 A1955498 AA033947 

AA845593 AI62371 1 N6B5B3 C00064 AA193567 AW083868 AW163216 AA191595 AA522778 AI628008 AI915518 AA8435Q8 AI926195 
M176265AW167963AA992115 W93647AW103572AI862994AI342059M9im^^ A1591107 
AI199673 AI81 1766 AI275832 AI422233 AI191852 AJ096682 AI580124 AJ683612 AA582453AA927559AA485415T32414AJ084978 H44849 
H44848 H20477 T91 695 W47039 AA070055 AA024795 AA328855 AA379248 AA379330 AA385580 W25920 W03688 M448359 AA093881 
AW362477 AA089997 AJ350265 W93479 N99688 AA932257 AW351469 H68590 AA663402 AA069771 AW087986 A1858420 AA600214 
AJ970774 AI857712 AI683081 AI885584 AW131 150 AI567981 AW002714 AW1 89973 AW075495 AW1 68303 AA953714 AW516881 A1357375 
AI566663 AW512676 AI570580 A1023690 AA448216 AI079853 AI422707 AA779516 AW026972 AW130082 AW162307 AW438646 AA709332 
AW192394 AI167350 AI217879 AI1 291 52 AA719509 AI350480 AA66341 8 AI003634 AW1 1 8546 AA1 80261 AA442833 AI268625 AA888881 
A1038759 AA846723 AI248770 AA993694 AI280335 AI885107 AW518649 AA641563 AA995835 AA582521 AI276744AA436478 AI017360 
A1620763 AI859887 N73926 AI076327 A1741615 A1160617 AW1 72819 AI492005 AA677429 AA996334 A1693771 AI950039 AI245629 AI288515 
AI8661 86 T93293 AA1 73262 AA599779 AI680092 AW439316 AI084555 AI272672 AI583507 AW47321 9 AA738132 AW473283 AI367492 
AA995410 AI689624 AA206353 AI033095 AI040382 AA873630 AI221074 AI934840 Ai4 18680 AA844306 R94503 AA773520 AA843169 
AA219425 AA629658 A181 1719 AW41 1 275 AI590981 W37907 AI591 178 AI684051 AA983238 AA669347 AA976239 AA704570 AI628339 
. A1884391 AI241580 AI003539 AW1 76667 AA009650 N34566 Ai 333493 A) 186070 AA070827 AA411683 AI280884 AA872023 AA207255 
AA021576 N71953 AI885888 AW076039 T15777 AI537673 AW248048 H09554 W93480 W47001 AW0791 14 AA063160 M757453 R60788 
AI859431 H20478 AA21 8882 AA757465 AA 100995 AI864135 AI934209 AA070503 H47008 AA219546 W61 039 W93907 AW385050 W37967 
W78028 AA189007 AA479136 R93650 AA442312 T30287 AA847628 AA180262 AA009549 C03892 AW149464 AA310963 AA219693 
AA069747 R29207 AA094784 AA293615 AA447848 AI984167 N90393 C05097 N56499 AW292351 AW149681 AW473258 AA629322 AI004409 
AW105577 A1954937 AI811070 AA902422 AW514437 AA535460 AA916877 AW517122 AA974657 AA975649 AW517130 AW517129 F31737 
W07688 AA193645 AA378994 AA489273 F32267 W39303 AA021181 N86810AA406524AA062553AA436801 H08985 H15979 N4031O 




AA43S789 AA232172 AW360778 W25862 R60282 AA436530 AA378894 AA 187461 AI940535 AA604210 M0895J4 AA360421 N88243 N84281 
AA209340 N561 74 N88374 AA1 91088 AW247691 AA24901 3 AA0931 1 1 AA972536 AW298594 AA375893 T1 21 39 W281 86 AW243849 
AJ288629 AA843996 W 15260 AI188288 AW248079 R15836 

119599 genbank_W45552 W45552 

112382 genbank_R59904 R59904 

105264 genbank_AA227934 AA227934 

100071 entrez_A28102 A28102 

123315 714071J AA49S369 AA496646 

Table 6A shows 99 genes up-regulated in non-smokers with lung cancer relative to smokers with lung cancer. These genes were selected from 59580 probesets on the 
Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting 
the relative level of mRNA expression. 



Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: average of AI for samples from non-smokers with adenocarcinoma divided by the 90th percentile of AI for samples from smokers with adenocarcinoma 

R2: average of AI for samples from non-smokers with squamous cell carcinoma divided by the 90th percentile of AI for samples from smokers with squamous cell carcinoma 
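
The R1 and R2 ratios defined above can be illustrated with a short calculation. This is a sketch under stated assumptions: the source does not say how the 90th percentile was computed, so linear interpolation between ranks is assumed here, and the function names `percentile` and `up_regulation_ratio` as well as the intensity values are hypothetical and purely illustrative.

```python
from statistics import mean
from typing import Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """q-th percentile (0-100) by linear interpolation between ranks (assumed method)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def up_regulation_ratio(nonsmoker_ai: Sequence[float],
                        smoker_ai: Sequence[float]) -> float:
    """Mean AI of non-smoker samples divided by the 90th percentile of smoker AI."""
    return mean(nonsmoker_ai) / percentile(smoker_ai, 90)


# Invented intensities for one probeset, for illustration only.
nonsmoker_samples = [100, 200]
smoker_samples = list(range(1, 12))  # 1, 2, ..., 11
ratio = up_regulation_ratio(nonsmoker_samples, smoker_samples)
```

Dividing by the 90th percentile rather than the mean of the comparison group makes the ratio conservative: a probeset scores highly only when the non-smoker average exceeds nearly all of the smoker samples.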



Pkey 


ExAccn 


UnigeneID 


Unigene Title 


100971 


BE379727 


Hs.83213 


fatty acid binding protein 4, adipocyte 


101174 


L17330 


Hs.280 


pre-T/NK cell associated protein 


101296 


Y12490 


Hs.85092 


thyroid hormone receptor interactor 11 


101304 


AA001021 


Hs.6685 


thyroid hormone receptor interactor 8 


101806 


AA586894 


Hs.112408 


S100 calcium-binding protein A7 (psorias 


101972 


S82472 




gb:beta -poNDNA polymerase beta {axon a 


102274 


U30930 


Hs.158540 


UDP glycosyltransferase 8 (UDP-galactose 


102394 


NM_003816 


Hs.2442 


a disintegrin and metalloproteinase doma 


102832 


U92015 




gb:Human clone 143789 defective mariner 


103010 


X52509 


Hs.161640 


tyrosine aminotransferase 


103439 


X98286 




gb:H.sapiens mRNA for Ggase like protel 


103563 


102911 


Hs.150402 


activin A receptor, type 1 


103857 


AI076795 


Hs.45033 


lacrimal proline rich protein 


104239 


AB002367 


Hs.21355 


doublecortin and CaM kinase-like 1 


104590 


AW373062 


Hs.83623 


nuclear receptor subfamily 1, group I, m 


104907 


AA055829 


Hs.196701 


ESTs, Weakly similar to ALU1_HUMAN ALU 


106131 


BE514788 


Hs.296244 


SNARE protein 


106672 


H47233 


Hs.30643 


ESTs 


106872 


T56887 


Hs.18282 


KIAA1134 protein 


106960 


AA156238 


Hs.32501 


ESTs 


106971 


Z43846 


Hs.194478 


Homo sapiens mRNA; cDNA DKFZp43401572 (f 


107982 


AA035375 


Hs.57887 


ESTs, Weakly similar to KIAA0758 protei 


108562 


AA100796 




gb:zm26c06.s1 Stratagene pancreas (93720 


108599 


AB018549 


Hs.69328 


MD-2 protein 


108663 


BE219231 


Hs.292653 


ESTs, Weakly similar to T26845 hypotheti 


109247 


AA314907 


Hs.85950 


ESTs 


109630 


R44607 


Hs.22672 


ESTs 


110193 


AI004874 


Hs.310764 


Homo sapiens mRNA; cDNA DKFZp434M082 (fr 


110234 


H24458 


Hs.32085 


EST 


110644 


R94207 


Hs.268989 


ESTs, Highly similar to type II CALM/AF1 


110886 


AW274992 


Hs.72249 


three-PDZ containing protein similar to 


111057 


T79639 


Hs.14629 


ESTs 


111950 


AF071594 


Hs.110457 


Wolf-Hirschhorn syndrome candidate 1 


112291 


R53972 


Hs.26026 


ESTs 


112956 


Z43784 


Hs.75893 


ankyrin 3, node of Ranvier (ankyrin G) 


113009 


T23699 


Hs.7246 


ESTs 


113060 


BE564162 


Hs.250820 


hypothetical protein FLJ14827 


113073 


N39342 


Hs.103042 


microtubule-associated protein 1B 


113074 


AK001335 


Hs.31137 


protein tyrosine phosphatase, receptor t 


113121 


T48011 


Hs.8764 


EST 


113125 


AA968672 


Hs.8929 


hypothetical protein FLJ11362 


113757 


AA703095 


Hs.18631 


ESTs 


113848 


W52854 


Hs.27099 


hypothetical protein FLJ23293 similar to 


113884 


AI333076 


Hs.28529 


chromosome 12 open reading frame 2 


113936 


W17056 


Hs.83623 


nuclear receptor subfamily 1, group I, m 


114875 


AA235609 


Hs.236443 


Homo sapiens mRNA; cDNA DKFZp564N1063 ( 


114987 


AA251016 


Hs.87808 


EST 


115460 


AW958439 


Hs.38613 


ESTs 


115722 


W91692 


Hs.59509 


ESTs 


116261 


AA481788 


Hs.190150 


ESTs 


116830 


H61037 


Hs.70404 


ESTs, Weakly similar to ALU2_HUMAN ALU 


116970 


AB023179 


Hs.9059 


KIAA0962 protein 


117178 


H98675 


Hs.269034 


ESTs 


117757 


AF088019 


Hs.46732 


EST 


118283 


AA287747 


Hs.173012 


ESTs, Weakly similar to A46010 X-linked 


118384 


AF217525 


Hs.49002 


Down syndrome cell adhesion molecule 


118657 


AI822106 


Hs.49902 


ESTs 


120328 


AA923278 


Hs.290905 


ESTs, Weakly similar to protease [H.sapi 


120404 


AB023230 


Hs.96427 


KIAA1013 protein 


120524 


AA261852 


Hs.192905 


ESTs 


120688 


AW207555 


Hs.97093 


Homo sapiens cDNA: FLJ23004 fis, clone L 



R1 

15.00 



7.50 
7.50 
13.50 
9.50 

9.00 

13.50 

16.50 

7.00 
11.50 

9.50 

16.50 
13.00 

7.00 

12.50 

16.50 

8.00 

17.00 

16.50 

11.00 



9.79 
32.50 



19.50 
6.00 



9.50 

aso 

7.50 

7.50 
16.50 



7.00 
6.00 
17.92 



R2 

3.64 

2.46 
12.00 
2.68 
2.11 



2.50 
3.94 
12.66 
2.17 

£38 
2.95 

140 
5.00 



3.00 
2.79 
4.50 



3.82 
2.21 

2.65 

6.00 
4.63 
7.00 
6.00 
2.27 
9.00 



2.68 



2.50 
2.39 
3.50 
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WO 02/086443 




121558 


AA412497 




gb:zt95g12.s1 Soares_testis_NHT Homo sap 


121676 


H56037 


Hs.108146 


ESTs 


121936 


AJ024600 


Hs.98612 


ESTs 


121938 


AA428659 


Hs.98610 


ESTs 


122177 


AA435789 


Hs.98833 


EST 


123442 


AA299652 


Hs.111496 


Homo sapiens cDNA FLJ11643 fis, clone HE 


123551 


AA608837 




gb:af03h12.s1 Soares_testis_NHT Homo sap 


123756 


AA60S971 


Hs.112795 


EST 


123861 


AA620840 




gb:af89g01.s1 Soares_testis_NHT Homo sap 


124371 


N24924 


Hs.188601 


ESTs 


127477 


BE328720 


Hs.280651 


ESTs 


127591 


AI190540 


Hs.131092 


ESTs 


128252 


AA455924 


Hs.192228 


ESTs 


128426 


AI265784 


Hs.145197 


ESTs 


128925 


R67419 


Hs.21851 


Homo sapiens cDNA FLJ12900 fis, clone NT 


128945 


AI990506 


Hs.8077 


Homo sapiens mRNA; cDNA DKFZp547E184 (fr 


129105 


AI769160 


Hs.108681 


Homo sapiens brain tumor associated prot 


129235 


AW977238 


Hs.126084 


KIAA1055 protein 


129506 


AB020664 


Hs.11217 


KIAA0877 protein 


129595 


U09550 


Hs.1154 


oviductal glycoprotein 1, 120kD (mucin 9 


130160 


AA305588 


Hs.267695 


UDP-Gal:betaGlcNAc beta 1,3-galactosyltr 


130340 


D82326 


Hs.239106 


solute carrier family 3 (cystine, dibasi 


131220 


AB023194 


Hs.300855 


KIAA0977 protein 


131430 


AI879148 


Hs.26770 


fatty acid binding protein 7, brain 


132114 


NM_006152 


Hs.40202 


lymphoid-restricted membrane protein 


132458 


AA935315 


Hs.48965 


Homo sapiens cDNA: FLJ21693 fis, clone C 


132647 


NNL008927 


Hs.54432 


sialyltransferase 4B (beta-galactosidase 


132655 


D49372 


Hs.54460 


small inducible cytokine subfamily A (Cy 


132682 


A1077500 


Hs.54900 


serologically defined colon cancer antig 


132747 


AA345241 


Hs.55950 


ESTs, Weakly similar to KIAA1330 protein 


132812 


R50333 


Hs.92186 


Leman coiled-coil protein 


133337 


AF085983 


Hs.293676 


ESTs 


133876 


All 34906 


Hs.771 


phosphorylase, glycogen; liver (Hers dis 


134119 


AW157837 


Hs.79226 


fasciculation and elongation protein zet 


134464 


AA302983 


Hs.239720 


CCR4-NOT transcription complex, subunit 


134542 


M14156 


Hs.85112 


insulin-like growth factor 1 (somatomedi 


135002 


AA448542 


Hs.251677 


G antigen 7B 


135305 


AA203555 


Hs.98268 


Homo sapiens cDNA FU14903 fis, clone PL 






2.95 



10.00 

15.00 

14.00 

8.93 

13.04 

11.50 

11.00 

6.50 



7.00 



10.00 
15.50 

6.50 

20.00 
11.50 
17.50 
6.10 



7.50 



87.00 



2.50 

4.33 
3.02 

2.08 
2.11



4.25 
10.00 



6.15 
5.58 

2.53 
2.50 
2.83 
3.82 
5.00 
3.00 
2.06
2.27 
11.50 

6.50 



TABLE 6B shows the accession numbers for those primekeys lacking UnigeneIDs for Table 6A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number
Accession: Genbank accession numbers



Pkey    CAT number        Accessions
108562  36375_1           AA100796 AF020589 AA074629 AA075946 AA100849 AA085347 AA126309 AA079311 AA079323 AA085274
103439  35330_1           X98266 N41124
123551  genbank_AA608837  AA608837
123861  genbank_AA620840  AA620840
102832  entrez_U92015     U92015
101972  entrez_S82472     S82472
121558  genbank_AA412497  AA412497



121 



WO 02/086443 PCT/US02/12476 

Table 7A shows 98 genes down-regulated in non-smokers with lung cancer relative to smokers with lung cancer. These genes were selected from 53680 probesets on the
Eos/Affymetrix Hu03 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a normalized value reflecting
the relative level of mRNA expression.

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: 90th percentile of AI for samples from smokers with adenocarcinoma divided by the average of AI for samples from non-smokers with adenocarcinoma

R2: 90th percentile of AI for samples from smokers with squamous cell carcinoma divided by the average of AI for samples from non-smokers with squamous cell
carcinoma.
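
As an illustration of the R1 and R2 definitions for Table 7A, the ratio for a single probeset could be computed along the lines of the sketch below. This is not part of the original analysis: the function names, the sample AI values, and the use of linear interpolation for the percentile are all assumptions, since the percentile method is not specified.

```python
from statistics import mean


def percentile(values, q):
    """Return the q-th percentile (0-100) of values by linear interpolation.

    The interpolation rule is an assumption; the source does not say how
    percentiles were computed.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty sample")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def smoker_ratio(smoker_ai, nonsmoker_ai):
    """90th percentile of smoker AI divided by the mean non-smoker AI."""
    return percentile(smoker_ai, 90) / mean(nonsmoker_ai)


# Hypothetical AI values for one probeset (not taken from the tables).
smokers = [100.0, 200.0, 300.0, 400.0, 500.0]
non_smokers = [40.0, 50.0, 60.0]
r1 = smoker_ratio(smokers, non_smokers)
```

Under these assumptions a large ratio marks a probeset whose expression is lower in non-smokers than in smokers, which is the direction Table 7A reports.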



Pkey 


ExAccn 


UnigeneID


Unigene Title


R1


R2


401107 
lUUlor 


047701 


Uf> 704 01 


aldo-keto reductase family 1, member C3




104. 1U 


4 AA1 Oft 




Un 40CC4 

ns.loooi 


neuroblastoma (nerve tissue) protein




77 /(ft 


4AAC7C 

lOOoTb 




Up 17ACQ 


calcitonin/calcitonin-related polypeptid


4 no A(\ 




4AA074 

101)9/1 


DC170717 


Un 01041 

Hs.oJ21o 


fatty acid binding protein 4, adipocyte


AC'i on 




101046


K01160




(NONcJ 


672.00 




4A4ncR 


AIA/07A1CJI 


Un QQQ 


Charcot-Leyden crystal protein


Rft nn 




4A4 47K 

1011/0 


1 IQ1G74 


Up iroqa 
nS.oDSoO 


melanoma antigen, family A, 2 




77 OA 

r r.4.u 


101497


W05150 


Hs.37034


homeo box A5 


tto on 

02.00 




101663 


NMJJ0352O 


Hs.2178


H2B histone family, member Q


70. nn 

ro.UU 




101677 


NM_000715


Hs.1012 


complement component 4-binding protein, 


186.20 




4A47JC 

101745 


MBoVOO 


Hs.150403


dopa decarboxylase (aromatic L-amino aci


an oq 




101941


S77583




gb:HERVK10/HUMMTV reverse transcriptase


no on 




102125 


NM_006456


Hs.288215


sialyltransferase




103.10


102242


U27185


Hs.62547


retinoic acid receptor responder (tazaro


C7 Aft 
Of MM 




4 All/ A 


1 I17ACC 


Up 17CCC7 
US.Cf 000/ 


macrophage stimulating 1 (hepatocyte gro 


71 fift 




4A11CO 


U39840 


Un 1QOOC7 

HS.299oor 


hepatocyte nuclear factor 3, alpha 




RO 70 


4ft1/1C7 


MM ftA41QX 


Up 11 CO 


dual specificity phosphatase 4 


4 CI Aft 






U71207


Hs.29279


eyes absent (Drosophila) homolog 2 




rr 7n 
oo. ru 


102796


AI079646


Hs.107019


symplekin; Huntingtin interacting protei




co on 
Oo.oU 


102629 


NM_006183


Hs.80962


neurotensin 




IRfl RO 


103207






gb:Human endogenous retrovirus mRNA for 


7A AA 




103242


X76342


Hs.369


alcohol dehydrogenase 7 (class IV), mu o 






4 mica 
103200 


V7Q/HC 
A/0*»10 


Up 14CC 


casein, alpha 






103351 


X89211




gb:H.sapiens DNA for endogenous retrovir 


RA CA 

OH.bU 




104212 


AB002298 


Hs.173035 


KIAA0300 protein


ec oa 




104252 


AF002246 


Hs.210863


cell adhesion molecule with homology to 


63.80 




104258 


AF007216 


Hs.5462 


solute carrier family 4, sodium bicarbon 


94.40 




105024 


AA126311


Hs.9879


ESTs 


68.20 




106200 


AI097144


Hs.5250


ESTs, Weakly similar to ALU1_HUMAN ALU S




7vl CA 


106440 


AA449563 


Hs.151393


glutamate-cysteine ligase, catalytic sub




74 4ft 
M.1U 


106566


DC10Q04 A 




p.K<£ft4 4 40A4CC4 MIU ItMf^f^ 47 Unmn p.n'mnr n 

gD.oui 1 loOior 1 Nin_ivioU_i f nomo sapiens c 


71 1ft 

r«).2U 




105605 


AVV77229o 


Hs.21103


Homo sapiens mRNA; cDNA DKFZp564B076 (fr


83.80 




106614 


AA648459 


Hs.335951 


hypothetical protein AF301222




62.30 


106654 


AW075485 


Hs.286049 


phosphoserine aminotransferase 




202.40


106999 


H93281 


Hs.10710 


hypothetical protein FLJ20417




89.60 


108700 


AA121518 


Hs.193540 


ESTs, Moderately similar to 2109260A B c 




66.40 


108810 


AW295647 


Hs.71331


hypothetical protein MGC5350 




95.50 


108857 


AK001458


Hs.62180


anillin (Drosophila Scraps homolog), act




63.40


109597 


AA989362 


Hs.293780


ESTs 


oc An 
oo.OO 




109691 


T65568 


Hs.12860


ESTs 




58.70 


109704 


AI743850


Hs.12876


ESTs 




60.60


110942 


R63503 


Hs.28419 


ESTs 


76.40 




111722 


R23924 


Hs.23596 


EST 


74.60




112891 


T03927


Hs.293147


ESTs, Moderately similar to A46010 a*u


CA OA 

04.80 




112992


AL157425


Hs.133315


Homo sapiens mRNA; cuna DKFZp7oUl324 tt 




76.70


113073


N39342


Hs.103042


microtubule-associated protein 1B 




120.20 


114251 


H15261 


Hs.21948 


ESTs 


127.20




115230 


A A 17Q*IAA 

AA27BJOO 


Hs.124292


11 _f-Sft| A . CI 1114 1*3 fin ntnna 1 

Homo sapiens cuna. ruizoi 20 tis, ctone l 


47i4 AA 

1/4.00 




115291 


BE545072 


Hs.122579


hypothetical protein FLJ10461




91.00


115815 


AW905328


Hs.180842


ribosomal protein L13


DO.40 






A\A/Q71E17 

AWo72o2f 


• U» CQ7C4 

Hs.oy/bi 


COT- Ufnnt/tti nimilnr In HAD4 Ul IMAM HCATU 

to is, weawy similar 10 UAr i_human utAi m 




oiR fin 


115965 


AA001732 


Hs.173233


hypothetical protein FLJ10970


82.80 




41fiin7 


Al 41101G 


Up 4 71C71 

nS,i /20/2 


nypouieticai protein rLjzuuyo 




iri Rn 


116552 


D20508 


Hs.164649


hypothetical protein DKFZp434H247


69.00




116571 


D45652




_L,_tl IIIOCAIOitO UttMm- mJlttt l.inn 1> tltm*! 

go:HUMGo02o4o Human adult lung s direct 


64.20




118466


N66741




ah*VT33aflS s1 Mnrtnn Fetal Cochlea Homo 




63.50 


120484 


AA253170 


Hs.96473 


EST 


81.60 




120983 


AA398209 


Hs.97587 


EST 




81.10 


121034 


AL389951 


Hs.271623 


nucleoporin 50kD




66.20 


121423 


AW973352 


Hs.290585 


ESTs 


64.40 




122553 


AA451884 


Hs.190121 


ESTs 




60.40 


122946 


AI718702


Hs.308026


major histocompatibility complex, class


188.60 




123130 


AA487200 




gb:ab19f02.s1 Stratagene lung (937210) H 




80.20 


124472 


N52517 


Hs.102670 


EST 


71.00 




124526 


N62095 


Hs.33185


ESTs, Weakly similar to JC7328 amino aci




104.90 


125469 


H49193 


Hs.124984 


ESTs, Moderately similar to ALU7_HUMAN A 




72.00 


125731 


R61771 


Hs.26912 


ESTs 




69.90 


125747 


NM_002884


Hs.865


RAP1A, member of RAS oncogene family


69.00 




126020 


H79853


Hs.114243


ESTs 




62.40 


126547 


U47732 


Hs.84072 


transmembrane 4 superfamily member 3




62.80 


126966 


R38438 


Hs.182575 


solute carrier family 15 (H+/peptide tra 




60.10 
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127472 


AA761378 


Hs.192013 


ESTs 


70.20 


127610 


AA960867 


Hs.150271 


ESTs, Highly similar to unnamed protein 


64.00 


127742 


AW293496 


Hs.180138 


ESTs 


85.20 


127987 


AA022103


Hs.124511 


ESTs 


96.60 


128233 


AW889132 


Hs.11916 


ribokinase




128420 


AA650274 


Hs.41296 


fibronectin leucine rich transmembrane p




128766 


AW160432 


Hs.296460 


craniofacial development protein 1 


66.80 


129014 


AW935187 


Hs.170162 


KIAA1357 protein




129215 


AB040930 


Hs.126085 


KIAA1497 protein


64.20 


130090 


H97878 


Hs.132390 


zinc finger protein 36 (KOX 18)


63.80 


130385 


AW067800 


Hs.155223


stanniocalcin 2




130732 


AW890487 


Hs.63984 


cadherin 13, H-cadherin (heart)




131025 


AB040900 


Hs.6189 


KIAA1467 protein


64.40 


131241 


BE501914 


Hs.24654 


Homo sapiens cDNA FLJ11640 fis, clone HE


76.20 


131775 


AB014548 


Hs.31921 


KIAA0548 protein


97.80 


132240 


AB018324 


Hs.42676 


KIAA0781 protein




132856 


NM_001448


Hs.58367


glypican 4


133.20 


132977 


AA093322 


Hs.301404 


RNA binding motif protein 3 


133749 


L20852 


Hs.10018 


solute carrier family 20 (phosphate tran 




133818 


AI110684 


Hs.7645 


fibrinogen, B beta polypeptide 


341.00 


134264 


AF149297 


Hs.8087 


NAG-5 protein 




134265 


M83772 


Hs.80876 


flavin containing monooxygenase 3 




134346 


X84002 


Hs.82037 


TATA box binding protein (TBP)-associate 


66.00 


134395 


AA456539 


Hs.8262 


lysosomal-associated membrane protein 2




135047 


AL134197 


Hs.93597 


cyclin-dependent kinase 5, regulatory su 


71.40 


135056 


N75765 


Hs.93765 


lipoma HMGIC fusion partner 


135309 


AI564123 


Hs.42500 


ADP-ribosylation factor-like 5 


70.40 



78.90 
10a90 

58.53 
139.60 



71.00 
88.40 

59.30 

64.30 
23153 

75.80 
108.30 



TABLE 7B shows the accession numbers for those primekeys lacking UnigeneIDs for Table 7A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number
Accession: Genbank accession numbers 



Pkey    CAT number        Accessions
103207  30635_-4          X72790
106566  120358_1          BE298210 AI672315 AW086489 BE298417 AA455921 AA902537 BE327124 R14963 AA085210 AW274273 AI333584 AI369742 AI039658
                          AI885095 AI476470 AI287650 AI885299 AI985381 AW592624 AW340136 AI266556 AA456390 AI310815 AA484951
116571  genbank_D45652    D45652
118466  genbank_N66741    N66741
101046  entrez_K01160     K01160
101941  entrez_S77583     S77583
103351  entrez_X89211     X89211
123130  genbank_AA487200  AA487200

Table 8A shows 1720 genes either up or down-regulated in lung tumors or chronically diseased lung relative to a broad collection of over 40 distinct normal body tissues.
Chronically diseased lung samples represent chronic non-malignant lung diseases such as fibrosis, emphysema, and bronchitis. These genes were selected from 39494
probesets on the Eos/Affymetrix Hu02 Genechip array. Gene expression data for each probeset obtained from this analysis was expressed as average intensity (AI), a
normalized value reflecting the relative level of mRNA expression.

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: 70th percentile of AI for lung tumors divided by 90th percentile of AI for normal lung

R2: 70th percentile of AI for chronically diseased lung divided by 90th percentile of AI for normal lung
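
The R1 and R2 columns of Table 8A each divide one percentile by another. A minimal sketch of that calculation, using `statistics.quantiles` from the Python standard library, is given below. The AI values are invented for illustration, the helper names are hypothetical, and the "inclusive" interpolation method is an assumption, as the percentile method is not stated.

```python
from statistics import quantiles


def decile(values, k):
    """Return the (10*k)-th percentile of values, for k in 1..9."""
    # quantiles(..., n=10) returns the nine cut points between deciles;
    # the "inclusive" method is an assumed choice of interpolation.
    return quantiles(values, n=10, method="inclusive")[k - 1]


def percentile_ratio(disease_ai, normal_ai):
    """70th percentile of disease AI over 90th percentile of normal-lung AI."""
    return decile(disease_ai, 7) / decile(normal_ai, 9)


# Hypothetical AI values for one probeset (not taken from Table 8A).
tumor_ai = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
normal_lung_ai = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
r1 = percentile_ratio(tumor_ai, normal_lung_ai)
```

Because the numerator uses a lower percentile than the denominator, a ratio above 1 in this sketch requires most disease samples to exceed nearly all normal-lung samples.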



Pkey 


ExAccn 


UnigeneID


Unigene Title


R1 


R2 


300097 


AI916973


Hs.213603 


ESTs 


5.46 


4.69 


300117 


AW189787 


Hs.147474 


ESTs 


0.58 


0.56 


300197 


AI686661


Hs.218286 


ESTs 


4.26 


5.44 


300201 


AI308300 




gb:ta90c06.x1 NCI_CGAP_Brn20 Homo sapien


0.62 


0.83


300225


AI989963


Hs.197505


ESTs


1.68


1.75


300247 


AW274682 


Hs.161394 


ESTs 


1.08 


2.28 


300256 


AI469095


Hs.298241 


Transmembrane protease, serine 3 


0.86 


1.00 


300337 


AI707881


Hs.202090 


ESTs 


5.80 


9.09 


300362 


Z42308 




gb:HSC0FB121 normalized infant brain cDN 


4.18 


12.78 


300374 


AI859947


Hs.314158 


ESTs 


2.99 


4.38 


300387 


AW270150 


Hs.254516 


ESTs 


1.50 


2.53 


300440 


AI421541


Hs.146164 


ESTs 


3.98 


5.25 


300441 


R10367 


Hs.307921 


EST, Weakly similar to Z232_HUMAN ZINC F


3.18 


6.60 


300449 


AI362967 


Hs.132221 


hypothetical protein FLJ12401


0.43 


0.62 


300469 


AW135830 


Hs.233955 


hypothetical protein FLJ20401


0.16 


0.83 


300552 


X85711 


Hs.21838 


hypothetical protein FLJ11191


4.10 


9.75 


300627 


W27363 




gb:ab37d01.r1 Stratagene HeLa cell s3 93


4.60 


12.60 


300630 


AW118822 


Hs.128757 


ESTs 


2.91 


5.86 


300716 


AI216113


Hs.126280


hypothetical protein FLJ23393


1.00 


0.92 


300738 


AI623332 


Hs.130541


KIAA1542 protein 


1.82 


1.71 


300777 


AA235361 


Hs.96840 


KIAA1527 protein


4.48 


8.22 


300790 


AI492471 


Hs.188270


ESTs 


1.29 


1.18 


300832 


AI688147 


Hs.220515 


ESTs, Weakly similar to T03829 transcrip 


5.51 


8.56 


300836 


Z44942 


Hs.22958 


calcium channel alpha2-delta3 subunit


4.90 


6.34 


300838 


AI582897 


Hs.192570 


hypothetical protein FLJ22028


1.70 


2.81 


300878 


AW449802 


Hs.285901 


Homo sapiens cDNA FLJ20428 fis, clone KA


4.56 


7.91 


300897 


AI890356


Hs.127804


ESTs, Weakly similar to T17233 hypotheti


2.23 


1.58 


300926 


AA504860 




gb:ab03a10.s1 Stratagene fetal retina 93 


2.13 


3.50 


300960 


AI041019 


Hs.152454


ESTs 


2.74 


4.46 


300961 


AW204069 


Hs.312716 


ESTs, Weakly similar to unnamed protein 


1.00 


1.00 


300962


AA593373 


Hs.293744 


ESTs 


1.46 


1.51 


300967 


AA565209 


Hs.269439 


ESTs 


0.39 


1.30 


300987 


AW450840 


Hs.148590 


ESTs, Weakly similar to AF208846 1 BM-00


1.49 


1.08 


300988 


AI927208


Hs.208952 


ESTs 


0.16 


0.37 


301050 


AW136973 


Hs.288516 


ESTs, Weakly similar to S69890 mitogen I 


3.23 


1.94 


301098 


AA677570 


Hs.185918 


ESTs 


6.76 


14.28 


301157 


AA729905 


Hs.231916 


ESTs 


3.16 


8.85 


301162 


AI142118 


Hs.129004


ESTs 


1.68 


7.18 


301170 


AA737594 


Hs.247606 


ESTs 


4.40 


6.42 


301192 


AI808751 


Hs.121188 


ESTs 


6.38 


11.59 


301193 


AA758115 


Hs.128350 


ESTs, Weakly similar to JC5423 2-hydroxy 


4.35 


7.78 


301267 


AW297762 


Hs.255690 


ESTs 


1.56 


1.61 


301281 


AA843986 


Hs.190586 


ESTs 


2.19 


1.78 


301341 


AI819198 


Hs.208229 


ESTs 


0.76 


0.76 


301382 


AA912839 


Hs.163369 


ESTs 


1.00 


1.81 


301407 


AW450466 


Hs.126830 


ESTs 


1.48 


1.51 


301452 


AA975688 


Hs.159955


ESTs 


0.51 


1.46 


301483 


AW272467 


Hs.254655 


Untitled 


2.40 


5.02 


301494 


AI678034 


Hs.131099 


ESTs 


2.79 


3.41 


301521 


AI733621


Hs.133011


zinc finger protein 117 (HPF9)


0.67 


0.67 


301531 


AI077462 


Hs.134084 


ESTs 


2.52 


3.76 


301580 


AI878959


Hs.73737


splicing factor, arginine/serine-rich 1


7.41 


11.92 


301676 


Z43570 


Hs.27453 


ESTs, Moderately similar to G01251 Rar p 


8.31 


10.70 


301690 


F05865 


Hs.108323 


ubiquitin-conjugating enzyme E2E 2 (homo 


2.70 


4.22 


301718 


F07744 


Hs.7987 


DKFZP434F162 protein 


4.20 


8.78 


301799 


AA384252 


Hs.286132 


D15F37 (pseudogene)


5.93


7.04


301804 


AA581004 


Hs.62180 


anillin (Drosophila Scraps homolog), act


1.70 


0.76 


301822 


X17033 


Hs.271986 


integrin, alpha 2 (CD49B, alpha 2 subuni


1.58 


1.36 


301846 


R20002 


Hs.6823


hypothetical protein FLJ10430


1.00 


1.00 


301868 


T71508 


Hs.13861


ESTs, Weakly similar to pH sensitive max


2.88


5.49 


301882 


T78054 




gb:yc97g09.r1 Soares infant brain 1NIBH 


2.28 


3.80 


301905 


AI991127


Hs.117202


ESTs 


1.00 


1.00 


301948 


AA344647 


Hs.116724


aldo-keto reductase family 1, member B11


5.28 


2.28 


301960 


AW070252 


Hs.27973 


KIAA0874 protein


5.38 


6.48 


302011


T91418 


Hs.125156 


transcriptional adaptor 2 (ADA2, yeast, 


3.03 


3.42 


302016 


N40834 


Hs.23495 


hypothetical protein FLJ11252


1.00 


1.25 


302041 


NM_001501


Hs.129715


gonadotropin-releasing hormone 2


0.71 


0.99 


302072 


AJ238381 


Hs.132576


paired box gene 9 


1.60 


1.71 


302094 


AI286176


Hs.6786 


ESTs 


0.52 


1.20 


302095 


AW044300 


Hs.137506 


Homo sapiens BAC clone RP11-120J2 from 7


2.75


4.93 


302148 


AW269618 


Hs.23244 


ESTs 


3.04 


3.87 



302155 
302201 
302202 
302206 
302209 
302235 
302290 
302328 
302346 
302360 
302384 
302406 
302409 
302423 
302432 
302435 
302437 
302455 
302472 
302476 
302489 
302490 
302562 
302566 
302630 
302634 
302638 
302647 
302655 
302656 
302668 
302679 
302680 
302697 
302705 
302711 
302719 
302742 
302755 
302771 
302789 
302795 



302803 
302812 
302847 
302885 
302943 
302977 
303006 
303011 
303013 
303061 
303077 
303090 
303091 
303094 
303095 
303131 
303195 
303196 
303216 
303222 
303234 
303251 
303295 
303297 
303316 
303467 
303506 
303552 
303598 
303637 
303655 
303756 
303856 
303893 
303907 
303946 
303978 
303981 
303990 
303998 



AI088485 

AJ006276 

AF097159

AI937193 

AF047445 

AL049987 

AL117607

AA354849 

AL039101 

AJ010901 

Y08982 

U86751 

AF155156 

AB028977 

AL080068

AF092047 

AB024730 

AA356923 

AA317451 

AF182294

T80660 

AA885502 

AJ005585 

AA085996 

AB029488 

AB032953 

AA463798 

X57723 

AJ227892 

AW293005 

AA560691 

H65022 

AW192334 

AJ001408 

U09060 

L08442

W69724 

L12069 

AW384815 

H98476 

AJ245067 

AJ245313 

Y08250 

AA442824 

N31301 

X98940 

AL137763 

AI581344 

AW263124 

AF078950 

AF090405 

F07898 

AF151882 

AF163305 

AA443259 

AF192913 

AF195513 

AF202051 

AW081061 

AA082211 

AA082298 

AA581439 

AA333538 

AA132255 

AW340037 

AA205625 

T80072 

AF033122 

AA398801 

AA340505 

AA359799 

AA382814 

AF056083 

AA504702 

AI738488 



Hs.144759 

Hs.159003 

Hs.159140 

Hs.41143 

Hs.159297 

Hs.166361 

Hs.175563 

Hs.23240 

Hs.194625 

Hs.198267 

Hs.202676 

Hs.211956 

Hs.218028

Hs.225974 

Hs.272534 

Hs.227277 

Hs.227473 

Hs.240770 

Hs.6335

Hs.241578 

Hs.230424 

Hs.167032 

Hs.48956

Hs.248572 

Hs.272100 

Hs.173560 

Hs.102696 

Hs.198273 

Hs.146274 

Hs.70704 

Hs.180789 

Hs.38216 



ESTs 

transient receptor potential channel 6 
UDP-Gal:betaGlcNAc beta 1,4- galactosylt



Hs.149208 
Hs.42522 

Hs.272838 

Hs.293961 
Hs.152664 

Hs.132127 
Hs.127812 
Hs.315111 
Hs.24139 



304006 



N88597 

AW467774 

AW474196 

AW513315 

AW513804 

AW515465 

AW516449 

AW516611 

AW517947 



Hs.27693 

Hs.146286 

Hs.130683 

Hs.278953 

Hs.134079 

Hs.103180 

Hs.233936 

Hs.59710 

Hs.152328 

Hs.204501 

Hs.143951 

Hs.115897

Hs.208067 

Hs.13423 

Hs.14125 

Hs.323397 

Hs.105887 

Hs.224662 

Hs.24879 

Hs.258802 

Hs.115838

Hs.180532

Hs.113503

Hs.171880 

Hs.306637 

Hs.278834 



killer cell lectin-like receptor subfami
Homo sapiens mRNA; cDNA DKFZp564F112 (fr
Homo sapiens mRNA; cDNA DKFZp564M0763 (f
Homo sapiens cDNA FLJ13496 fis, clone PL
dynein, cytoplasmic light intermediate
mucin 4, tracheobronchial
synaptonemal complex protein 2
CD3-epsilon-associated protein; antisens
adaptor-related protein complex 4, epsil
KIAA1054 protein

Homo sapiens mRNA; cDNA DKFZp564J062 (fr
sine oculis homeobox (Drosophila) homolo
UDP-N-acetylglucosamine:a-1,3-D-mannosid
nuclear cap binding protein subunit 2, 2
SWI/SNF related, matrix associated, acti
U6 snRNA-associated Sm-like protein LSm8
Homo sapiens cDNA FLJ13540 fis, clone PL
ESTs 

gap junction protein, beta 6 (connexin 3 
hypothetical protein FLJ22965
SMS3 protein 

odd Oz/ten-m homolog 2 (Drosophila, mous 
MCT-1 protein 

NADH dehydrogenase (ubiquinone) 1 beta s
ESTs 

Homo sapiens, clone IMAGE:2B23731, mRNA, 
S164 protein 

gb:yu66g11.r1 Weizmann Olfactory Epithel
ESTs 

gb:Homo sapiens mRNA for Immunoglobulin 

gb:Human immunoglobulin heavy chain, V-r 

gb:Human autonomously replicating sequen

hypothetical protein FLJ20920

gb:Homo sapiens (clone WR4.10VH) anti-th

KIAA1555 protein 

ESTs 

gb:Homo sapiens mRNA for immunoglobulin 
hypothetical protein FLJ10494
gb:H.sapiens mRNA for variable region of
ESTs, Moderately similar to putative DNA
hypothetical protein FLJ20051



hypothetical protein LOC57822 
ESTs, Weakly similar to T17330 hypotheti 
hypothetical protein FLJ12894
Homo sapiens cDNA: FLJ23137 fis, clone L
gb:Homo sapiens clone 2A1 scFV antibody
RAB22A, member RAS oncogene family
peptidylprolyl isomerase (cyclophilin)-l
gb:H.sapiens T-cell receptor mRNA
kinesin family member 13A
zinc finger protein 180 (HHZ168) 
Pur-gamma 
NM23-H8 
OC2 protein 

myosin, light polypeptide, regulatory, n 

ESTs 

ESTs 

hypothetical protein FLJ10534
ESTs

protocadherin 12
ESTs 

Homo sapiens clone 24468 mRNA sequence 
p53 regulated PA26 nuclear protein 
ESTs 

ESTs, Weakly similar to Homolog of rat 2 
ESTs, Weakly similar to unnamed protein 
gb:EST96097 Testis I Homo sapiens cDNA 5 
phosphatidic acid phosphatase type 2C 
ATPase, (Na+)/K+ transporting, beta 4 po 
ESTs 

glucose phosphate isomerase 
karyopherin (importin) beta 3 
polymerase (RNA) II (DNA directed) polyp
Homo sapiens cDNA FLJ12363 fis, clone MA
gb3co43c12j(1 NCLCGAPJJH Homo sapiens 
ESTs, Weakly similar to ALU1_HUMAN ALU S 
gb3cu71a11a1 Na.CGAP_Kid8 Homo sapiens 
gb3d68f05 Jt1 NCI_CGAP_Ut2 Homo sapiens 
gb:xp70b11.x1 NCI_CGAP_Ov39 Homo sapiens
gb-jct66h02j(1 NCLCGAPJJ12 Homo sapiens 
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304008 


AW518198 


Hs.3297 


304009 


AW518206 


Hs.181165 


304024 


T03036 




304026


T03160 




304028 


T03266 




304036 


T16855 


Hs.244621


304046 


T54803 




304061 


T61521 




304063 


T62536 




304097 


R25376 


Hs.177592


304114 


R78946 




304122 


H28966 




304155 


H68696 




304203 


N56929 




304234 


W81608 




304267 


AA064862 


Hs.73742 


304270 


AA069711 


Hs.297753 


304287 


AA079286 


Hs.78466 


304348 


AA179868 




304415 


AA290747 


Hs.169476


304430 


AA347682 




304456 


AA411240 




304521 


AA464716 




304526 


AA476427 




304542 


AA482602 


Hs.169476 


304546 


AA486074 


Hs.297681 


304607 


AA513322 




304640 


AA524440 


Hs.111334 


304650 


AA527489 


Hs.3463 


304735 


AA576453 




304760 


AA580401 




304849 


AA588157 


Hs.13801 


304917 


AA602685 


Hs.284136 


304921 


AA603092 


Hs.297753 


304966 


AA613893 


Hs.282435 


304987 


AA618044


Hs.300697 


305016 


AA626876 




305034 


AA630128 




305072 


AA641012 




305111 


AA644187 


Hs.303405 


305148 


AA654070 




305159 


AA659166 


Hs.275668 


305190 


AA665955 




305232 


AA670052 


Hs.169476 


305235 


AA670480 




305245 


AA676695 


Hs.81328 


305312 


AA700201 




305322 


AA701597 


Hs.163019 


305394 


AA720942 


Hs.300697 


305413 


AA724659 




305447 


AA737856 




305476 


AA745664 


Hs.287445 


305483 


AA748030 


Hs.303512 


305528 


AA769156 




305612 


AA782347 


Hs.272572 


305614 


AA782866 




305616 


AA782884 


Hs.275865 


305637


AA806124 




305639 


AA806138




305650 
305690 


AA807709 
AA813477 




305726 


AA828156 


Hs.73742 


305728 


AA828209




305759 


AA835353 




305792 


AA845256 




305864 


AA864374 


Hs.73742 


305901 


AA872968 




305910 


AA875981 




306015


AA897116 




306017 


AA897221 


Hs.109058 


306020 
306053 


AA897630
AA906316


Hs.130027


306065


AA906725 




306104 


AA910956 




306109 


AA911861 




306148


AA917409 


Hs.288036 


306242
306288 


AA932805 
AA936900 




306325 


AA953072 


Hs.210546 


306353 


AA961382 


Hs.275885 


306375 


AA968650 


Hs.276018 


306396


AA970223 




306428 


AA975110 


Hs.191228 


306442 


AA976899 




306446 


AA977348 





ribosomal protein S27a 
eukaryotic translation elongation factor
gb:FB21B7 Fetal brain, Stratagene Homo s 
gb:FB26F2 Fetal brain, Stratagene Homo s 
gb:FB7C1 Fetal brain, Stratagene Homo sa 
ribosomal protein S14 
gbyb42d06£l Stratagene fetal spleen (9 
gb7b73gOU1 Stratagene ovary (93721 7) 
gb:yc04c1ls1 Stratagene lung (937210) H 
ribosomal protein, large, P1
gb:y!87g0ls1 Soares placenta Nb2HPHomo 
gb:ym31a06.s1 Soares infant brain 1 NIB H 
gbryr78b0S.s1 Soares fetal Over spleen 
gb:yy82d08.s1 Soares_mu!(jp!e_scierosis_ 
g b;zdBShQ6.s1 Soares JetaLrtearLNbHH 1 9W 
ribosomal protein, large, P0 
vimentin

proteasome (prosome, macropain) 26S sub 
gb:zp38g1ls1 Stratagene muscle 937209 H 
glyceraldehyde-3-phosphate dehydrogenase
gb:EST54044 Fetal heart II Homo sapiens 
gbzv26g05.s1 Soares_NhHMPu_S1 Homosapi 
gbzx82c11.s1 Soares ovary tumor NbHOT H 
gb:zx02c05.s1 Soaies_total_fetusJ!b2HFB_. 
glyceraldehyde-3-phosphate dehydrogenase
serine (or cysteine) proteinase inhibito 
gb;nhB5eQ8.s1 NCLCGAP_Br1.1 Homosapien 
ferritin, light polypeptide 
ribosomal protein S23

gbmm75h11.s1 NCLCGAP_Co9 Homo sapiens 
gb:nn13g09.s1 NCI_CGAP_Co12 Homo sapiens 
KIAA1685 protein
PRO2047 protein 
vimentin 
ESTs 

Immunoglobulin heavy constant gamma 3 (G 
gb:zu89h08.s1 Soares_testis_NHT Homo sap 
gb:ab99c04.s1 Stratagene lung (937210) H 
gb:nr72a12.s1 NCLCGAP_Pr24 Homo sapiens 
ESTs 

gb:nt01g08.s1 NCI CGAPJ.ym3 Homo sapiens 
EST, Weakly similar to EF1 D_HUMAN ELONG 
gb;ag57d12.s1 Gessler Wilms tumor Homo s 
glyceraldehyde-3-phosphate dehydrogenase
gb:ag37e01.s1 Jia bone marrow stroma Horn 
nuclear factor of kappa light polypeptid 
gb:z]44fD7.s1 SoaresJetalJiver_spleen_ 
EST 

immunoglobulin heavy constant gamma 3 (G 
gb:a)10f08.s1 Soares_parathyrddJumor_N 
gb:nx10c08.s1 NCI.CGAP^GCS Homo sapiens 
hypothetical protein FLJ11726
EST 

. gb:nz12e05.s1 NCLCGAPJ3CB1 Homo sapiens 
hemoglobin, alpha 2 

gb:aj09hOZs1 Soares_parathyroidJumor_N 
ribosomal protein S18

gb:oe29a1ls1 NCLCGAP_Pr25 Homo sapiens 
gb:oe29c1ls1 NCLCGAP_Pr25 Homo sapiens 
gtxnw31e04.s1 Na_CGAP.GCB0Homosapiens4.49 
gb:ai67a05.s1 SoaresJestis^NHT Homo sap 
ribosomal protein, large, P0
gb:of34a02.s1 NCLCGAPJ<id6 Homo sapiens 
gb;ak72b06.s1 Barstead spleen HPLRB2 Horn 
gbak84aC8.s1 Barstead spleen HPLRB2 Horn 
ribosomal protein, large, P0
gb:oh63h08.s1 NCI.CGAP_Kid5 Homo sapiens 
gb:nx21h0ls1 NCLCGAP.GC3 Homo sapiens 
gb:am08b07.s1 Soares_NFUJ_GBC_S1 Homos1.56 
ribosomal protein S6 kinase, 90kD, polyp
EST 

gb:ok03g03.s1 Soares_NFUT_GBC_S1 Homos 
gb:ok78gOZs1 NC1_CGAP_GC4 Homo sapiens 
gkokB5h11.s1 NCLCGAP_Wd3 Homo sapiens 
gb:og21a07.s1 NCLCGAP.PNS1 Homo sapiens 
tRNA isopentenylpyrophosphate transferas
gb:oo60g04.s1 NCLCGAPJaiS Homo sapiens 
gb:oi53h05.s1 NQ_CGAP_HN3 Homo sapiens 
interleukin 21 receptor 
ribosomal protein S18 
EST, Moderately similar to JC4662 ribos 
gb:opQ9d05.s1 NCI_CGAP_Kid6 Homo sapiens 
hypothetical protein FLJ20284
gb:oq35eQ9.s1 NCI_CGAP_GC4 Homo sapiens 
gb:oq72e12.s1 NO JX^JGd6 Homo sapiens 
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AI223158 


Hs.147885 
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307517 


AI275055 




307551 


AI281556




307561 


AI282207 




307608 


AI290295 




307657 


AI306428 


Hs.298262 


307691 


AI318285




307701 


AI318583


Hs.276672 


307718 


AI333406 


Hs.83753 


307730 


AI336092 




307760 


AI342357




307764 


AI342731




307783 


AI347274




307796 


AI350556 




307807 


AI351799 




307808 


AI351826 




307820 


AI355761 




307830 


AI356722 


Hs.276737 


307852 


AI365541 




307902 


AI380462 




307997 


AI434512 


Hs.181165 


308002 


AI435240 


Hs.283442 


308011 


AI439473 




308023 


AI452732 


Hs.251577 


308041 


AI458824 


Hs.169476 


308059 


AI468938 


Hs.276877 


308085 


AI474135 


Hs.181165 


308101 


AI475950 


Hs.181165 


308106 


AI476803




308122 


AI480123 


Hs.309411 


308154 


A1500S0O 




308171 


AI523632


Hs.298766 


308211 


AI557029 


Hs.278572 


308213 


AI557041 




308216 


AI557135 




308219 


AI557246




308271 


AI567844


Hs.252259 


308319 


AI583983 


Hs.181165 


308362 


AI613519 


Hs.105749 


308413 


AI636253 


Hs.196511 


308450 


AI660860 


Hs.96840 


308464 


AI672425 


Hs.277117 


308588 


AI718299
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AI719893 




308615 


AI738593 


Hs.101774 


308643 


AI745040




308673 


AI760854 
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A1607405 
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AI824118 
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AI832332 





gb»p33c06.s1 SoaresJJFlJLGBC.SI Homos 
ribcscmaJ protein L16a 

gb;or84d07.s1 NCLCGAP.U5 Homo sapiens 
EST, Weakly similar to RL23.HUMAN 60S R 
gb:ou57e08*1 NCI CGAP_Br2 Homo sapiens 
gb:os25c1Zs1 NCLCGAP_Wd5 Homo sapiens 
gb:os18c10.s1 NCLCGAP.KW5 Homo sapiens 
glyceraldehyde-3-phosphate dehydrogenase
ribosomal protein, large P2
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0.86 


ESTs 


1.00 


1.00 


ESTs 


2.60 


4.21 


ESTs 


1.96 


3.49 


ESTs 


7.16 


8.32 


ESTs 


1.38 


2.28


ESTs 


3.58 


8.13 


ESTs 


2.08


4.92 


ESTs 


3.06 


4.79 


ESTs 


4.22 


9.21 


ESTs 


1.88 


4.15 


ESTs 


3.12 


4.55 


ESTs 


2.73 


3.34 
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317570 


AI733381 


Hs.127122 


ESTs 


1.00 


2.43 


317571 


AA938663 


Hs.199828


ESTs 


5.20 


11.95 


317598 


AW206035 


Hs.192123 


ESTs 


0.33 


1.56 


317627 


AI346110 


Hs.132553 


ESTs 


1.50 


1.39 


317650 


AI733310 


Hs.127346


ESTs 


0.48 


1.46 


317659 


AA961216 


Hs.127785 


ESTs 


4.18 


7.14 


317674 


AW294909 


Hs.132208 


ESTs 


2.92 


3.20 


317686 


AA969051 


Hs.187319 


ESTs 


1.00 


1.01 


317692 


AI307659 


Hs.174794 


ESTs 


5.33 


9.59 


317701 


AI674774 


Hs.128014


ESTs 


1.00 


1.00 


317711 


AI733015 


Hs.272189 


ESTs 


5.13 


7.81


317722 


AI733373


Hs.128119 


ESTs 


2.50 


6.03 


317756 


AA973667 


Hs.128320


ESTs 


1.59 


1.30 


317777 


AI143525


Hs.47313


KIAA0258 gene product


1.00


2.48


317799 


AI498273 


Hs.128808


ESTs 


1.78 


2.11 


317803 


AA983251 


Hs.128899


ESTs 


0.80 


1.06 


317821 


AI368158


Hs.70983


PTPL1-associated RhoGAP 1


0.17 


0.68 


317848 


AI820575


Hs.129086 


Homo sapiens cDNA FLJ12007 fis, clone HE


5.30 


8.16 


317850 


N29974 


Hs.152982 


hypothetical protein FLJ13117


1.30 


2.28 


317861 


AW341064 


Hs.129119 


ESTs 


2.18 


5.93 


317865 


AI298794 


Hs.129130 


ESTs 


4.48 


8.20 


317869 


AW295184 


Hs.129142 


deoxyribonuclease II beta 


0.44 


0.99 


317881 


AI827248 


Hs.224398 


Homo sapiens cDNA FLJ11469 fis, clone HE


4.06 


2.23


317890 


AI915599


Hs.129225 


ESTs 


4.68 


7.48 


317899 


AI952430 


Hs.150614 


ESTs, Weakly similar to ALU4_HUMAN ALU S 


3.14 


3.37 


317986 


AI005163 


Hs.201378 


ESTs, Weakly similar to T12545 hypoM 


0.28 


1.66 


318001 


AW235697 


Hs.130980 


ESTs 


5.12


9.97 


318016 


AI016694 


Hs.256921 


ESTs 


1.86 


4.50 


318023 


AW243058 


Hs.131155 


ESTs 


2.92 


5.22 


318054 


AW449270 


Hs.232140 


ESTs 


3.92 


6.37 


318068 


AI024540


Hs.131574 


ESTs 


1.21 


1.27 


318117 


AI208304 


Hs.250114 


ESTs 


0.86 


1.17 


318187 


AI792585


Hs.133272 


ESTs, Weakly similar to ALUC_HUMAN !!!!


5.90 


6.98 


318223 


AI077540 


Hs.134090 


ESTs 


1.05 


0.90 


318240 


AI085377


Hs.143610 


ESTs 


3.10 


2.40 


318255 


AI082692 


Hs.134662 


ESTs 


0.02 


1.05 


318266 


AI554341 


Hs.271443 


ESTs 


6.12 


10.55 


318330 


AI093840 


Hs.143758 


ESTs 


4.98 


7.90 


318369 


AI493501


Hs.170974 


ESTs 


2.46 


5.62 


318428


AI949409 


Hs.194591 


ESTs 


0.77 


0.45 


318458 


AI149783 


Hs.158438


ESTs 


3.54 


4.92 


318467 


AI151395 


Hs.144834 


ESTs 


4.56 


5.62 


318473 


AI939339 


Hs.146883 


ESTs 


2.08 


4.05 


318476 


AI693927 


Hs.265165 


ESTs 


4.22 


8.07 


318487 


AI167877 


Hs.143716 


ESTs 


1.47 


1.05 


318488 


AI217431


Hs.144709 


ESTs 


1.40


4.14 


318491 


T26477


Hs.22883 


ESTs, Weakly similar to ALU8_HUMAN ALU S 


1.84 


1.90 


318499 


T25451 




gb:PTHI188 HTCOL1 Homo sapiens cDNA 5V3 


2.58 


5.20 


318537 


AA377908 


Hs.13254 


ESTs 


3.26 


4.18 


318538 


N28625 


Hs.74034 


Homo sapiens clone 24651 mRNA sequence 


0.35 


1.07 


318547 


R20578 


Hs.90431 


ESTs 


3.22 


4.60 


318552 


R18364 


Hs.90363 


ESTs 


4.87 


9.08 


318575 


R55102 


Hs.107761 


ESTs, Weakly similar to unnamed protein 


1.91 


1.98 


318580 


T34571 


Hs.49007 


poly(A) polymerase alpha


2.74 


6.22 


318587 


AA779704 


Hs.168830


Homo sapiens cDNA FLJ12136 fis, clone MA


0.85 


2.46 


318596 


AI470235 


Hs.172698 


EST 


4.88 


4.93 


318622 


T48325 


Hs.237658 


apolipoprotein A-II


4.80 


12.51 


318629 


N25163 


Hs.8861 


ESTs 


0.39 


1.04 


318637 


AA243539 


Hs.9196 


hypothetical protein 


1.72 


3.57 


318648 


T77141 


Hs.184411 


albumin 


6.27 


9.91 


318650


AA393302 


Hs.176626 


hypothetical protein EDAG-1 


3.96 


8.84 


318671 


AA188823 


Hs.299254 


Homo sapiens cDNA: FLJ23597 fis, clone L


1.53 


0.81 


318679 


T58115 


Hs.10336 


ESTs 


1.00 


2.19 


318711 


AJ936475 


Hs.101282 


Homo sapiens cDNA: FLJ21238 fis, clone C


3.05 


3.18 


318725 


AI962487 


Hs.242990 


ESTs 


1.08 


2.46


318728 


Z30201 


Hs.291289 


ESTs, Weakly similar to ALU1_HUMAN ALU S 


0.77 


1.33 


318740 


NM_002543


Hs.77729


oxidised low density lipoprotein (lectin 


6.25 


1.49 


318776 


R24963 


Hs.23766 


ESTs 


1.00 


3.01 


318784 


H00148


Hs.5181 


proliferation-associated 2G4, 38kD 


2.70 


3.86 


318816 


F07873 


Hs.21273 


ESTs 


3.90 


7.13 


318865 


H10818 




gb:ym04f10.r1 Soares Infant brain 1NIBH 


2.25 


3.56 


318879 


R56332 


Hs.18268 


adenylate kinase 5 


1.78 


5.00 


318881 


Z43224 


Hs.124952


ESTs 


4.79 


14.13 


318894


F08138 


Hs.7387 


DKFZP56481 16 protein 


5.31 


7.00 


318901 


AW368520 


Hs.301528 


L-kynurenine/alpha-aminoadipate aminotra


1.03 


0.91 


318925 


Z43577 


Hs.21470 


ESTs 


2.23 


3.80 


318936 


AI219221 


Hs.308298


ESTs 


1.86 


7.16 


318982 


Z44140 


Hs.269622 


ESTs 


5.84 


9.79 


318986 


Z44186 


Hs.169161 


ESTs, Highly similar to MAON_HUMAN NADP- 


1.00 


1.00 


319041 


Z44720 


Hs.98365 


ESTs, Weakly similar to weak similarity 


3.38 


6.11 


319103 


H05896 


Hs.4993 


KIAA1313 protein


1.00 


1.07 


319170 


R13678 


Hs.285306 


putative selenocysteine lyase


3.79 


c no 

o.uo 


319196 


F07953 


Hs.16085 


putative G-protein coupled receptor 


1.00 


2.98 


319199 


F07361 


Hs.13306 


ESTs 


3.53 


5.66 
7.26 


319242 


F11472 


Hs.12839 


ESTs 


5.87
319263


T65331 


Hs.81360


Homo sapiens cDNA: FLJ21927 fis, clone H


1.81 


1.57 


319267 


F11802 


Hs.6818 


ESTs 


1.10 


4.72 


319270 


R13474 


Hs.290263 


ESTs 


4.80 


10.40 


319279 


T65094 


Hs.12677 


CGI-147 protein 


1.50 


111 


319282 


AA461358 


Hs.12876 


ESTs 


1.00 


1.00 


319289 


W07304 


Hs.79059 


transforming growth factor, beta recepto 


0.18 


0.68 


319291 


W86578 


Hs.285243 


hypothetical protein FLJ22029


0.26 


0.62 


319293 


F12119 


Hs.12583


ESTs 


3.13 


4.50 


319312 


Z45481 




gb:HSC2QE041 normalized infant brain cDN


1.10 


1.00 


319370 


H54254 


Hs.325823 


ESTs, Moderately similar to ALU5_HUMAN A


0.16 


0.73 


319391 


R06304 


Hs.13911 


ESTs 


1.26 


143 


319396 


H67130 


Hs.301743 


ESTs 


0.70 


0.76 


319398 


AA359754 


Hs.191198 


ESTs 


145 


3.59 


319407 


R05329 




gb:ye91b04.r1 Soares fetal liver spleen


2.00 


3.54 


319425 


T32930 




gb:yd39f07ji Soares fetal liver spleen


4.28 


8.81 


319433 


R06050 


Hs.191198 


ESTs 


6.15 


14.13 


319437 


AA282420 


Hs.111991


ESTs, Weakly similar to Y48A5A.1 [Celeg 


3.26 


5.68 


319466 


AJS09937 


Hs.116417 


ESTs 


1.76 


5.65 


319471 


R06546 


Hs.19717 


ESTs 


4.29 


4.64 


319480 


R06933 


Hs.184221 


ESTs 


1.00 


1.00 


319484 


T91772 




gb:yd52a10.s1 Soares fetal liver spleen 


2.81 


4.88 


319486 


AI382429


Hs.250799 


ESTs 


108 


182 


319508 


T99898 


Hs.270104 


ESTs, Moderately similar to ALU8_HUMAN A


180 


4.39 


319523 


T69499 


Hs.191184 


ESTs 


1.55 


3.25 


319545 


R83716 


Hs.14355


Homo sapiens cDNA FLJ13207 fis, clone NT


1.65 


1.19 


319546 


R09692 




gb:yf23b11.r1 Soares fetal liver spleen


5.11 


8.54 


319552 


AA096106 


Hs.20403 


ESTs 


1.89 


3.36 


319582 


T82998 


Hs.250154 


hypothetical protein FLJ12973


3.48 


4.82 


319586 


D78808 


Hs.283683 


chromosome 8 open reading frame 4 


0.26 


0.82 


319604 


R11679 


Hs.297753 


vimentin


1.68 


3.41 


319609 


AW247514 


Hs.12293 


hypothetical protein FLJ21103


3.06 


4.24 


319611 


H14957 




gb:ym19c10.r1 Soares infant brain 1NIB H 


2.76 


4.24 


319653 


AA770183 


Hs.173515 


uncharacterized hypothalamus protein HT0 


2.51 


3.55 


319657 


R19897 


Hs.106604 


ESTs 


5.32 


7.68 


319658 


R13432 


Hs.167481 


syntrophin, gamma 1 


3.35 


5.00 


319661 


H08035 


Hs.21398 


ESTs, Moderately similar to A Chain A, H 


5.18 


12.55 


319662 


H06382 


Hs.21400 


ESTs 


1.58 


1.56 


319708 


R15372 


Hs.22664 


ESTs 


1.00 


1.22 


319742 


T77658 


Hs.21162 


ESTs 


148 


3.13 


319748 


R18178 


Hs.295866 


Homo sapiens mRNA; cDNA DKFZp434N1923 (f 


3.02 


4.85 


319772 


R76633 


Hs.22646 


ESTs 


4.36 


11.61 


319788 


AA321932 


Hs.117414 


KIAA1320 protein


156 


3.68 


319805 


R92857 


Hs.271350 


likely ortholog of mouse polydom


4.63 


6.56 


319812 


N74860 


Hs.264330 


N-acylsphingosine amidohydrolase (acid c 


0.63 


1.32 


319834 


AA071267




gb:zm6 1 g01 A Stratagene fibroblast (937 


0.30


0.94 


319878 


T78517 


Hs.13941 


ESTs 


3.99


6.44 


319882 


AA258981 


Hs.291392 


ESTs 


5.09 


7.36 


319912 


T77559 


Hs.94109 


Homo sapiens cDNA FLJ13634 fis, clone PL


3.24 


3.21 


319935 


H79460 


Hs.271722 


ESTs, Weakly similar to ALU1_HUMAN ALU S


4.40 


9.42 


319944 


T79248 


Hs.133510


ESTs 


3.31 


5.39 


319947 


AA160967 


Hs.14479 


Homo sapiens cDNA FLJ14199 fis, clone NT


190 


4.95 


319962 


H06350 


Hs.135056 


Human DNA sequence from clone RP5-850E9


1.61 


1.57 


320007 


AA336314 




gb:EST40943 Endometrial tumor Homo sapie 


3.42 


6.29 


320018 


T83263 




gb:yd40h09.r1 Soares fetal liver spleen 


177 


5.14 


320030 


H63789 


Hs.296268 


ESTs, Weakly similar to KIAA0638 protein


4.10 


6.69 


320032 


AI699772 


Hs.292664 


ESTs, Weakly similar to A46010 X-linked


3.27 


3.27 


320040 


AA233671 


Hs.87164 


hypothetical protein FLJ14001


1.81 


1.64 


320047 


T86564 


Hs.302256 


EST 


3.38 


7.36 


320083 


AA074108


Hs.120844 


FOXJ2 forkhead factor


5.90 


16.73 


320096 


K58138 


Hs.117915 


ESTs 


108 


4.47 


320099 


AW411307 


Hs.114311


CDC45 (cell division cycle 45, S.cerevis 


1.00 


1.00 


320112 


T92107 


Hs.188489


ESTs


127 


10S 


320140 


H94179 


Hs.119023 


SMC2 (structural maintenance of chromoso


1.00 


1.00 


320188 


AW419200 


Hs.172318 


ESTs 


1.26 


1.00 


320193 


AA831259


Hs.17132 


ESTs 


158 


6.23 


320195 


R62203 


Hs.24321 


Homo sapiens cDNA FLJ12028 fis, clone HE


185 


4.53 


320199 


R78659 


Hs.29792 


ESTs 


0.40 


0.94 


320203 


AL049227 


Hs.124776 


Homo sapiens mRNA; cDNA DKFZp564N1116 (f 


0.84 


1.18 


320219 


AA327564 


Hs.127011 


tubulointerstitial nephritis antigen


1.00 


1.17 


320220 


AF054910 


Hs.127111 


tektin 2 (testicular) 


0.18 


1.09 


320225 


AF058989 


Hs.128231 


G antigen, family B, 1 (prostate associa


5.26 


13.75 


320231 


H03139 


Hs.24683 


ESTs 


1.59 


1.93 


320260 


NM_003608


Hs.131924


G protein-coupled receptor 65 


1.38 


4.56 


320267 


AL049337 


Hs.132571 


Homo sapiens mRNA; cDNA DKFZp564P016 (fr


1.00 


1.92 


320268 


K06019 


Hs.151293 


Homo sapiens cDNA FLJ10664 fis, clone NT


5.58 


5.70 


320322 


AF077374 


Hs.139322 


small proline-rich protein 3 


1.41 


1.01 


320325 


AI167978 


Hs.139851 


caveolin 2


0.05 


0.67 


320330 


AF026004 


Hs.141660 


chloride channel 2 


117 


1.26 


320339 


H10507


Hs.281434 


Homo sapiens cDNA FLJ14028 fis, clone HE


1.81 


132 


320388 


H16065 


Hs.31286


ESTs 


1.00 


3.22 


320402 


R22291 


Hs.23368 


Homo sapiens clone FLC0578 PRO2852 mRNA,


1.41 


1.36 


320413 


AA203711 


Hs.173269 


ESTs 


131 


3.61 


320432 


R62786 


Hs.124136 


ESTs 


11.25 


20.78 


320436 


AA253352 


Hs.293663 


ESTs 


122 


3.49 


320438 


W24548 


Hs.5669 


ESTs 


3.53 


8.14
Hs.180777

Hs^95267 

Hs^4321
Hs.24743
Hs.324522
Hs.300511

Hs.26638
Hs.172780
Hs.266416
Hs.34771

Hs.135904
Hs.199538
Hs.92023

Hs.293650
Hs.240770
Hs.241411
Hs.298351
Hs.172982
Hs.251414
Hs.6298
Hs.132743

Hs.82845 

Hs.38540 
Hs.292549 
Hs.255436 
Hs.268980 
Hs.255748
Hs.28803
Hs.21858
Hs.42568
Homo sapiens cDNA FLJ12028 fis, clone HE
hypothetical protein FLJ20171

ESTs 

ESTs
hypothetical protein FLJ22530
ESTs
ESTs
poly(A)-binding protein, nuclear 1
ESTs 

WNT1 inducible signaling pathway protein 

artemin 

claudin 14
H10998

H16568

H98597 

N66563 

N75081 

R36671 

R41408 

R54797 

R61398 

T23461 

T32446 

T40769 

T82310 

AA059347

AA252079 

AA281076 

AA303125
Hs.2110
Hs.82845
Hs.267319
Hs.49282
karyopherin alpha 5 (importin alpha 6)
KIAA0103 gene product
protease inhibitor 3, skin-derived (SKAL
ataxia-telangiectasia group D-associated
microfibrillar-associated protein 4
endogenous retroviral protease
guanine nucleotide binding protein (G pr
phosphoinositide-3-kinase, regulatory su
S100 calcium-binding protein A4 (calcium
zinc finger protein 9 (a cellular retrov
tryptophan 2,3-dioxygenase
hepatocyte nuclear factor 3, alpha
(NONE) 

gb:Human RP1 homolog mRNA, 3UTR region 
Homo sapiens cDNA: FLJ21930 fis, clone H
plasminogen activator, urokinase
ubiquitin carboxyl-terminal esterase L1
integrin, beta 4 

reticulocalbin 2, EF-hand calcium bindin 
S100 calcium-binding protein A2 
junction plakoglobin 

ESTs, Weakly similar to ALU7_HUMAN ALU S 

ESTs 

ESTs 

Homo sapiens cDNA FLJ11570 fis, clone HE

integrin, beta 8

ESTs

Homo sapiens voltage-gated sodium channe
receptor (calcitonin) activity modifying
dTDP-D-glucose 4,6-dehydratase
Homo sapiens cDNA FLJ13103 fis, clone NT
ESTs

hypothetical protein FLJ20666

ESTs, Moderately similar to ALU7_HUMAN A 

ESTs
ESTs

ESTs 

endogenous retroviral protease 
hypothetical protein FLJ14033 similar to
ESTs 

a disintegrin and metalloproteinase doma 
ESTs 

hypothetical protein KIAA1165
ESTs
glycoprotein (transmembrane) nmb
ESTs

Homo sapiens cDNA FLJ13496 fis, clone PL
KIAA1462 protein

anterior gradient 2 (Xenopus laevis) hom
hypothetical protein FLJ11088
NADPH oxidase 4
ESTs 

ESTs, Moderately similar to ALU7_HUMAN
331578
331589
332029
332033
332048
332085
Hs.152213
Hs.104072
Hs.97996

Hs.97901 

Hs.98314

Hs.21275 

Hs.82772 

Hs.139631
Hs.65641

Hs.145053
Hs.155546
Hs.173933
Hs.26530
Hs.111758
Hs.250700
Hs.15106
Hs.278430
Hs.1735
Hs.20183
Hs.166189
Hs.274407
Hs.25272
Hs.3239
Hs.50640
Hs.5101
Hs.63788
Hs.247926 
Hs.79070 
Hs.114765 
Hs.296938
gb:od74f04.s1 NCI_CGAP_Ov2 Homo sapiens
ESTs 
ESTs 

PTD007 protein 

EST 

EST 

ras homolog gene family, member I
ESTs, Weakly similar to rhotekin [M.musc
collagen, type III, alpha 1 (Ehlers-Danl
wingless-type MMTV integration site fami 
Homo sapiens NY-REN-62 antigen mRNA, par 
ESTs 
ESTs 

transcription termination factor, mitoc
EST 

Homo sapiens mRNA; cDNA DKFZp586L0120 (f 

hypothetical protein FLJ11011

collagen, type XI, alpha 1 

ESTs 

ESTs 

hypothetical protein FLJ20073

ESTs 

EST 

ESTs 

KIAA1211 protein

gb:ae41e11.s1 Gessler Wilms tumor Homo s

KIAA1080 protein; Golgi-associated, gamm

nuclear factor I/A

ESTs 

ESTs
EST

Stargardt disease 3 (autosomal dominant)
serum deprivation response (phosphatidyl
RNA binding motif protein, X chromosome
ESTs

hypothetical protein FLJ23045

retinol-binding protein 1, cellular 

Homo sapiens cDNA FLJ11918 fis, clone HE

ESTs 

keratin 6A 

tryptase beta 1

chromosome 14 open reading frame 1
cytochrome P450, subfamily XXIA (steroid
inhibin, beta B (activin AB beta polypep
cysteine-rich motor neuron 1
ESTs, Weakly similar to AF164793 1 prote
cytokeratin 2

protease, serine, 16 (thymus)

E1A binding protein p300

methyl CpG binding protein 2 (Rett syndr

tenascin XA

JAK binding protein 

protein regulator of cytokinesis 1 

hypothetical protein MGC2941 

propionyl Coenzyme A carboxylase, beta p 

gap junction protein, alpha 5, 40kD (con

v-myc avian myelocytomatosis viral oncog 

myeloid/lymphoid or mixed-lineage leukem 

dual specificity phosphatase 7 

hypothetical protein FLJ10902

TABLE 8B shows the accession numbers for those Pkeys in Table 8A lacking UniGene IDs. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the



Pkey    CAT number  Accession
322044  167363_1    AW340926 AA249063 N86075
322060  44320_1     AI341937 AW003063 U34725 AA904742
321430  42705_1     X57414 X57415
321467  43034_1     X13075 X13076
322125  48779_1     R93901 AF075073 R93902
322166  46861_1     H69434 AF085958 H69846
322173  46873_1     H52567 H52557 AF085970 H52164
322178  46882_1     H56535 AF085980 H56712
322179  46885_1     H92891 AF085982 H92777
321577  1615102_1   H84849 H84252 H84260 H86664 H85320
321587  1615333_1   H95531 H95521 H84529
313723  111953_1    AA070412 AA102346 AA081885
320997  627492_1    H22544 H46842 AI204929
322278  47271_1     W69304 AF086283 W69200
321687  218439_1    AA625149 AA313030 AA313052 H97463
313883  129439_1    AA665089 AA135130 AA484059 AA102419 AW877765
322320  47422_1     W79150 AF086419
322339  814584_1    AI668646 AI734214 W17348
314648  293660_1    AW979268 AA878419 AA431342 AA431628
300201  682222_1    AI308300 AI308296
306897  25196_-2    AI093967
323155  979809_1    AL120701 AL135041 AL121524
322527  38927_1     AF147359 T58511 T58560
322585  473768_1    W88919 W89125
300362  1574395_1   Z42308 H23514
322635  82296_1     AA005129 AA679084 AA694399
322664  85042_1     AA011522 AA702841 AA011691 AA330797
315454  380580_1    AI239464 AI239473 AA625812 AI208703
322687  37372_1     AF074666 AJ110759 AF090902
314852  327472_1    AI903735 AA491283 AI694953 AW976903 AA761362
307783  697809_1    AI347274 AW844024
324072  269032_1    AA381722 AA381829 AW963905 AW963902 AA381242
300527  221345_1    AA488472 W27363 AA317053 BE082689 AW987036 BE079872
323505  196389_1    AW970512 AA280251 AI652287 BE466438 AI650725 AA551854 AA281574 AW571481
315791  403558_1    AA678177 AA677034
324303  233842_1    AL118754 AA333202 H38001
316519  442885_1    AA847835 AA768376
300926  333127_1    AA504860 AA504911



'Accession' column. 



Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers

TABLE 8C shows the genomic position for those Pkeys in Table 8A lacking UniGene IDs and accession numbers. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22," Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.
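The strand and nucleotide-position columns described above can be read mechanically. The following is a minimal sketch only, assuming that each position cell has the form start-end, that coordinates are inclusive, and that Minus-strand entries list the higher coordinate first, as the rows of this table appear to do. The names `Exon` and `parse_exon` are illustrative and are not part of the original disclosure.

```python
from typing import NamedTuple


class Exon(NamedTuple):
    """One predicted exon from a Table 8C row (illustrative structure)."""
    pkey: int
    strand: str
    start: int  # lower genomic coordinate
    end: int    # higher genomic coordinate

    @property
    def length(self) -> int:
        # Inclusive coordinates are an assumption; the table does not say.
        return self.end - self.start + 1


def parse_exon(pkey: str, strand: str, nt_position: str) -> Exon:
    """Parse one row of Table 8C (hypothetical helper)."""
    first, second = (int(part) for part in nt_position.split("-"))
    strand = strand.strip().capitalize()
    if strand not in ("Plus", "Minus"):
        raise ValueError(f"unexpected strand: {strand!r}")
    # Normalise so that start <= end regardless of strand orientation.
    start, end = sorted((first, second))
    return Exon(int(pkey), strand, start, end)


# Two rows taken from Table 8C.
plus = parse_exon("332792", "Plus", "73381-73768")
minus = parse_exon("337605", "Minus", "1340555-1340397")
print(plus.length, minus.length)  # prints "388 159"
```

Storing the lower coordinate first for both strands keeps interval arithmetic uniform, while the strand field preserves the orientation reported in the table.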



Pkey    Ref                Strand  Nt_position
332792  Dunham, I. et al.  Plus    73381-73768
332816  Dunham, I. et al.  Plus    359844-360030
332906  Dunham, I. et al.  Plus    1923101-1923205
332911  Dunham, I. et al.  Plus    1961767-1961858
332912  Dunham, I. et al.  Plus    1962120-1962246
332922  Dunham, I. et al.  Plus    2009620-2009738
332956  Dunham, I. et al.  Plus    2510528-2510658
332959  Dunham, I. et al.  Plus    2518145-2518213
333138  Dunham, I. et al.  Plus    3369205-3369323
333139  Dunham, I. et al.  Plus    3369495-3369571
333221  Dunham, I. et al.  Plus    3978070-3978187
333380  Dunham, I. et al.  Plus    4904775-4904846
333387  Dunham, I. et al.  Plus    4910935-4910997
333512  Dunham, I. et al.  Plus    5560510-5560564
333524  Dunham, I. et al.  Plus    5612620-5612780
333585  Dunham, I. et al.  Plus    6234778-6234894
333618  Dunham, I. et al.  Plus    6562391-6562566
333627  Dunham, I. et al.  Plus    6620584-6620903
333626  Dunham, I. et al.  Plus    6629004-6629233
333650  Dunham, I. et al.  Plus    6795852-6797128
333678  Dunham, I. et al.  Plus    7068223-7068288
333750  Dunham, I. et al.  Plus    7608165-7608234
333763  Dunham, I. et al.  Plus    7692491-7692630
333767  Dunham, I. et al.  Plus    7694407-7694623
333768  Dunham, I. et al.  Plus    7695440-7695697
333769  Dunham, I. et al.  Plus    7696625-7696707
333772  Dunham, I. et al.  Plus    7706773-7706902
333777  Dunham, I. et al.  Plus    7746805-7746916
333846  Dunham, I. et al.  Plus    8008623-8008757
333884  Dunham, I. et al.  Plus    8153960-8154161
333887  Dunham, I. et al.  Plus    8154882-8155025
333891  Dunham, I. et al.  Plus    8156437-8156709
333892  Dunham, I. et al.  Plus    8156825-8157001
333948  Dunham, I. et al.  Plus    8583497-8583627
333954  Dunham, I. et al.  Plus    6563186-6563335
333966  Dunham, I. et al.  Plus    8655643-8655826
333968  Dunham, I. et al.  Plus    8681004-8681241
334061  Dunham, I. et al.  Plus    9686941-9687077
334094  Dunham, I. et al.  Plus    9889953-9890105
334113  Dunham, I. et al.  Plus    10282459-10282597
334161  Dunham, I. et al.  Plus    10599033-10599180
334219  Dunham, I. et al.  Plus    12716160-12716384
334239  Dunham, I. et al.  Plus    13056569-13056693
334333  Dunham, I. et al.  Plus    13603544-13603657
334378  Dunham, I. et al.  Plus    13907239-13907370
334382  Dunham, I. et al.  Plus    13915866-13916036
334562  Dunham, I. et al.  Plus    14987847-14987940
334588  Dunham, I. et al.  Plus    15032740-15032817
334616  Dunham, I. et al.  Plus    15176123-15176470
334633  Dunham, I. et al.  Plus    15333205-15333305
334866  Dunham, I. et al.  Plus    18872214-18872317
334891  Dunham, I. et al.  Plus    19299770-19299944
334934  Dunham, I. et al.  Plus    20103970-20104058
335015  Dunham, I. et al.  Plus    20682792-20682945
335120  Dunham, I. et al.  Plus    21436286-21436384
335125  Dunham, I. et al.  Plus    21441390-21441471
335179  Dunham, I. et al.  Plus    21634405-21634526
335188  Dunham, I. et al.  Plus    21669116-21669326
335211  Dunham, I. et al.  Plus    21774611-21774680
335361  Dunham, I. et al.  Plus    22807292-22807445
335379  Dunham, I. et al.  Plus    22899306-22899420
335414  Dunham, I. et al.  Plus    23235546-23235684
335416  Dunham, I. et al.  Plus    23237354-23237465
335496  Dunham, I. et al.  Plus    24164386-24164545
335497  Dunham, I. et al.  Plus    24167666-24167869
335558  Dunham, I. et al.  Plus    24740167-24740347
335586  Dunham, I. et al.  Plus    24990333-24990497
335686  Dunham, I. et al.  Plus    25439839-25439921
335784  Dunham, I. et al.  Plus    25942710-25942792
335823  Dunham, I. et al.  Plus    26365925-26366004
335983  Dunham, I. et al.  Plus    27938968-27939070
335995  Dunham, I. et al.  Plus    28009044-28009184
336021  Dunham, I. et al.  Plus    28686482-28686559
336034  Dunham, I. et al.  Plus    29014404-29014590
336038  Dunham, I. et al.  Plus    29022963-29023165
336107  Dunham, I. et al.  Plus    29987731-29987869
336632  Dunham, I. et al.  Plus    983890-985529
336633  Dunham, I. et al.  Plus    985591-988221
336634  Dunham, I. et al.  Plus    985296-986670
336635  Dunham, I. et al.  Plus    987908-988364
336636  Dunham, I. et al.  Plus    988418-989185
336637  Dunham, I. et al.  Plus    989276-990813
336638  Dunham, I. et al.  Plus    991906-993240
336659  Dunham, I. et al.  Plus    1896402-1896478
336694  Dunham, I. et al.  Plus    2420546-2420616
336721  Dunham, I. et al.  Plus    3371522-3371586
336900  Dunham, I. et al.  Plus    10236423-10236523
336948  Dunham, I. et al.  Plus    12692290-12692381
337028  Dunham, I. et al.  Plus    16644817-16644942
337054  Dunham, I. et al.  Plus    17821742-17821922
337162  Dunham, I. et al.  Plus    23478943-23479145
337183  Dunham, I. et al.  Plus    23943606-23943696
337184  Dunham, I. et al.  Plus    23973949-23974016
337268  Dunham, I. et al.  Plus    28011979-28012034
337299  Dunham, I. et al.  Plus    29022656-29022775
337389  Dunham, I. et al.  Plus    31401509-31401579
337493  Dunham, I. et al.  Plus    33330760-33330981
337549  Dunham, I. et al.  Plus    34474472-34474531
337755  Dunham, I. et al.  Plus    3971764-3971900
337809  Dunham, I. et al.  Plus    4449069-4449193
337871  Dunham, I. et al.  Plus    5443027-5443101
337958  Dunham, I. et al.  Plus    6969162-6969270
338008  Dunham, I. et al.  Plus    7697068-7697236
338033  Dunham, I. et al.  Plus    8092128-8092271
338110  Dunham, I. et al.  Plus    10384481-10384621
338112  Dunham, I. et al.  Plus    10391398-10391600
338145  Dunham, I. et al.  Plus    11386629-11386692
338148  Dunham, I. et al.  Plus    11448985-11449085
338179  Dunham, I. et al.  Plus    12808775-12808833
338197  Dunham, I. et al.  Plus    13638107-13638181
338279  Dunham, I. et al.  Plus    16168944-16169091
338316  Dunham, I. et al.  Plus    17089711-17089988
338322  Dunham, I. et al.  Plus    17132477-17132547
338357  Dunham, I. et al.  Plus    18062184-18062402
338359  Dunham, I. et al.  Plus    18074402-18074501
338366  Dunham, I. et al.  Plus    18252026-18252189
338374  Dunham, I. et al.  Plus    18371200-18371282
338414  Dunham, I. et al.  Plus    19345573-19345660
338418  Dunham, I. et al.  Plus    19435506-19435596
338501  Dunham, I. et al.  Plus    21244713-21244828
338506  Dunham, I. et al.  Plus    21221871-21221953
338523  Dunham, I. et al.  Plus    21509763-21509864
338662  Dunham, I. et al.  Plus    24404720-24404899
338804  Dunham, I. et al.  Plus    27236005-27236108
338836  Dunham, I. et al.  Plus    27792166-27792272
338879  Dunham, I. et al.  Plus    28410553-28410734
338937  Dunham, I. et al.  Plus    29160555-29160725
338993  Dunham, I. et al.  Plus    30077787-30078184
339047  Dunham, I. et al.  Plus    30760793-30760968
339100  Dunham, I. et al.  Plus    31141580-31141765
339114  Dunham, I. et al.  Plus    31456454-31456519
339121  Dunham, I. et al.  Plus    31563467-31583536
339170  Dunham, I. et al.  Plus    32216399-32216527
339293  Dunham, I. et al.  Plus    33223671-33223819
332858  Dunham, I. et al.  Minus   1339607-1339397
332982  Dunham, I. et al.  Minus   2628296-2628109
332984  Dunham, I. et al.  Minus   2632606-2632457
332998  Dunham, I. et al.  Minus   2711704-2711565
333058  Dunham, I. et al.  Minus   3028925-3028811
333097  Dunham, I. et al.  Minus   3204124-3204036
333121  Dunham, I. et al.  Minus   3308446-3308358
333122  Dunham, I. et al.  Minus   3309596-3309531
333123  Dunham, I. et al.  Minus   3310817-3310749
333140  Dunham, I. et al.  Minus   3377220-3376309
333260  Dunham, I. et al.  Minus   4308400-4308304
333603  Dunham, I. et al.  Minus   6466335-6465727
333604  Dunham, I. et al.  Minus   6467090-6466768
333904  Dunham, I. et al.  Minus   8217374-8217261
333906  Dunham, I. et al.  Minus   8218238-8218063
334183  Dunham, I. et al.  Minus   11832582-11832508
334187  Dunham, I. et al.  Minus   11921456-11921205
334222  Dunham, I. et al.  Minus   12732417-12732289
334223  Dunham, I. et al.  Minus   12734365-12734269
334255  Dunham, I. et al.  Minus   13200776-13200592
334492  Dunham, I. et al.  Minus   14478333-14478172
334648  Dunham, I. et al.  Minus   15363301-15363222
334787  Dunham, I. et al.  Minus   16299093-16298937
334933  Dunham, I. et al.  Minus   20078117-20077991

334945


Dunham, I. et al.


Minus 


9A19QC97 

£U13lw3-ZUl3co3/ 


334957 


Dunham, I. et al.


Minus


OrHTT31 \ 0A17091Q 
iUl f JOl 1-ZU1 1 OCX a 


334990 


Dunham, I. et al.


Minus


"}{\1A 1 1 CO OftlA 1 fl Q7 

ZUJ4 1 loy-ZU341 uov 


335093 


Dunham, I. et al.


Minus 


91 9079C7 91OO701JI 

c\£of oof -ZlZy/^14 


335288 


Dunham, I. et al.


Minus 


009A407C OOTAOT7A 

Z23U42/ 3-2Z3U3r /U 


335289 


Dunham, I. et al.


Minus 


9O9AC0CA 093AC7flC 

Zz3u5yoU-ZZ3U5 fUO 


335548 


Dunham, I. et al.


Minus 


0vlCM770 O>ICC0C7O 
24002/ / 3-Z40020/ 3 


335551 


Dunham, I. et al.


Minus 


OA R70 Q9Q 9/ C70Q Ci 
Z40 /90ZO-Z4D/03D 1 


335619 


Dunham, I. et al.


Minus 


ZOUOZOf /-Z0UOZ4 yo 


335620 


Dunham, I. et al.


Minus 


9CAQ9CR1 9CA09A9A 
ZOUyzOOl -Z3U92434 


335621 


Dunham, I. et al.


Minus 


9SrtQflft7fl_9 <?n0fl7fi7 
ZDUHiSO/ 0-c.OUju 10/ 


335682 


Dunham, I. et al.


Minus 


Z04/ 1 Zl 1 U30 


335755 


Dunham, I. et al.


Minus 


£5/03oUO-ZOro3/ hi 


335814 


Dunham, I. et al.


Minus 


Z03ZUU43-ZD31 a MO 


335815 


Dunham, I. et al.


Minus 


ZO JZUO 1 


335835


Dunham, I. et al.


Minus 


COSH J Jl 1 -^03?3<:40 


335851 


Dunham, I. et al.


Minus 


ZOOU1003-ZOOU4/ 4Z 


335868 


Dunham, I. et al.


Minus 


9R71 1/Q7 9R71 1 "JOft 
ZD/ 1 1 43 /-ZD/ 1 1 3UU 


335896


Dunham, I. et al.


Minus 


ZDS / f OJ9*Z0b / / 300 


335936 


Dunham, I. et al. 


Minus 


07*5Cn>l7>L07'5<;ni4rtft 
Z/3oU4/4-Z/30U4UU • 


335948 


Dunham, I. et al. 


Minus 


Z/ 0333Z4-Z / 333 / 00 


336066 


Dunham, I. et al. 


Minus 


WiA i HD/l 901^ F\QA 9 


336205 


Dunham, I. et al. 


Minus 


30477456-30477311 


336275 


Dunham, I. et al. 


Minus 


99naCfi7C99nQCC9C 
OZUOOO / 3-3ZU0D33D 


336292 


Dunham, I. et al. 


Minus 


3zol 0035-3^61 7927 


336331 


Dunham, I. et al. 


Minus 


333y 452 / -335S43 / 1 


335419 


Dunham, I. et al. 


Minus 


34UDZ3DO-34U52443 


336675 


Dunham, I. et al. 


Minus 


9ft9n7ciL9n9ncoi 
ZU2U / OO-2UZU004 


335684 


Dunham, I. et al. 


Minus 


91 RQnfiO-91 R70Q9 
Zl 30U0U-<(1 o/y yj 


336716 


Dunham, I. et al. 


Minus 


ozo??dz-ozoyoo^ 


336798 


Dunham, I. et al. 


Minus 


300 oyo4-oo 00 / 3/ 


337043 


Dunham, I. et al. 


Minus 


1 7/n799f\_1 7/079C'! 

1 / 4U/33U-1 /4U/231 


337046 


Dunham, I. et al. 


Minus 


i /o iuoyz-i /DIUOZI 


337128 


Dunham, I. et al. 


Minus 


9991 99015094 
ZZZ1 3Z31-ZZZ10U34 


337192 


Dunham, I. et al. 


Minus 


Z43y lo33-Z43sl f /I 


337194 


Dunham, I. et al. 


Minus 


Z4o 1 05 1 U-z4olU359 


337229 


Dunham, I. et al. 


Minus 


9C71 CC7G^9G71 CA Q 1 
ZO f 1 03/y-ZO / 1 040 1 


337325 


Dunham, I. et al. 


Minus 


CO>t Q_9Aft1 CflAn 
3UU1 3y40-3UUl 30UU 


337497 


Dunham, I. et al. 


Minus 


99971917 99971 9KQ 
333/ 131/ -333/ 1Z30 


337500 


Dunham, I. et al. 


Minus 


9997C919 9997C1CQ 
J /DZ1 Z-OJJ/D J 30 


337603 


Dunham, I. et al. 


Minus 


19QOOOC 10QO1Q>l 

izyy290-izyyiy4 


337605 


Dunham, I. et al. 


Minus 


19XRCCC 19X^307 

1340555-1340397 


337671 


Dunham, I. et al. 


Minus 


99finfi9X_99Rnoi7 
3Z0UD34-3Z0U34/ 


337786 


Dunham, I. et al. 


Minus 


41 332U3-41 33UO 1 


337862 


Dunham, I. et al. 


Minus 


034 / O3o- 034 / 55U 


338083 


Dunham, I. et al. 


Minus 




338158 


Dunham, I. et al. 


Minus 


1 1 7Q AAR R_1 1 70J9A9 
1 1 /y4403-l 1 /94343 


338161 


Dunham, I. et al. 


Minus 


1 01 0/71 C_1 01 O^RCQ 
1 Z1Z4/1 D-l Zl Z4000 


338182 


Dunham, I. et al. 


Minus 


1 9R9ZQ1 Q.1 0P94A97 

i zoz4yi y-i Z0Z4O// 


338189 


Dunham, I. et al. 


Minus 


19B7PC04 1907D>17Q 

lZo/o5y4-lZo/o4/0 


338199 


Dunham, I. et al. 


Minus 


1 0 f DUoOO-l J / OU / OU 


338215 


Dunham, I. et al. 


Minus 


14U5344 /-1 4U33335 


338469 


Dunham, I. et al. 


Minus 


9ACOAQQ7 90000^9 

ZU3ZU30 /-ZU52UZ4Z 


338549 


Dunham, I. et al. 


Minus 


99Ail0171 99A/fQA01 

zzu4y i / 1 -zztwyuo 1 


338561 


Dunham, I. et al. 


Minus 


0911 40Ce_OOQ440CC 

ZZ31 labb-223 » loob 


338671 


Dunham, I. et al. 


Minus 


9/1CAa>l91 9>tC009^fi 

Z450842 1 -Z45Uo340 


338676 


Dunham, I. et al. 


Minus 


OARnAOT 9>tC179CO 

2403/42/ -2403/30? 


338726 


Dunham, I. et al. 


Minus 


9C09ft9nC_9C09CC1 0 

zoyzozuo-iioyzoDi o 


338779 


Dunham, I. et al. 


Minus 


07A9A1C1 07A0070C 

z/ujuioi-t/uzy/yo 


338871 


Dunham, I. et al. 


Minus 


Z03U 1 /UO-203U loll 


338872 


Dunham, I. et al. 


Minus 


90900011 9Q00rt7OA 

zoouuy^ i *2o3uu/ yu 


338966 


Dunham, I. et al. 


Minus 


9QC1>I07C 9QC1X7XQ 

co oi 4o /o-zy oi 4 /4y 


339229 


Dunham, I. et al. 


Minus 


007*>*M^A 0070*5100 

3272233U-327221 99 


339264 


Dunham, I. et al. 


Minus 


32975145-32975053 


325228 


6381940 Plus 


2630-2694 




325235 


6381943 Minus 


162154-162264 


329568 


3962484 Plus 


1169-1619 




329560 


3962491 Plus 


2095-2990 




329541 


3983503 Minus 


2765-3059 




325328 


5865875 Plus 


86780-86854 




325340 


6017033 Minus 


166656-166819 


325373 


5866920 Minus 


1136686-1136777 


325367 


5866920 Minus 


922881-922958 


325389 


5866921 Plus 


239672-239759 


325436 


5666939 Minus 


29778-29907 




325498 


5866967 Plus 


173372-173930 


325471 


6017034 Minus 


289268-289342 


325557 


6056302 Plus 


50921-51050 




325559 


6249595 Minus 


118590-119172 


325560 


6249595 Minus 


133794-133981 


325569 


6249599 Plus 


79927-80217 




0&330/ 


0002402 Plus 


126724-126967 


325585 


6682462 Plus 


73476-73574 




325597 


5866992 Plus 


1065020-1065089 


325639 


5867002 Plus 


253525-253608 
325739 


5867038 


Minus 


205138-205269 




325740 


5867038 


Minus 


207533-207690 




325792 


6469828 


Minus 


1018-1176 




325735 


6552447 


Minus 


269122-269190 




325685 


6682468 


Plus 


117397-117483 




325586 


6682468 


Plus 


118337-118439 




325819 


6682490 


Minus 


130314-130370 




329764 


6048195 


Minus 


109733-109988 




329703 


6065793 


Minus 


139994-140138 


329543 


6448539 


Plus 


53403-53537 




329816 


6624888 


Minus 


70296-70423 




329860 


6687260 


Minus 


163474-163605 




325883 


5867087 


Plus 
TABLE 9A: Potential Therapeutic, Diagnostic and Prognostic Targets for Therapy of Lung Cancer 

Table 9A shows about 1312 genes up-regulated in lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid 
tumors) relative to normal body tissues. These genes were selected from about 59680 probesets on the Eos/Affymetrix Hu03 Genechip array. 

Table 9B shows the accession numbers for those Pkey's lacking UnigeneID's for Table 9A. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the 
"Accession" column. 

Table 9C shows the genomic positioning for those Pkey's lacking Unigene ID's and accession numbers in Table 9A. For each predicted exon, we have listed the genomic 
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed. 
Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the 
average of normal lung samples 

R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, asthma) divided by the average of normal lung samples 
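
As an illustration of how the R1 and R2 columns defined above could be computed for a single probeset, the following is a minimal sketch in Python. It assumes that per-sample expression values are available as plain lists; the function name `expression_ratios` and the sample values are hypothetical and are not taken from the tables.

```python
from statistics import mean


def expression_ratios(tumor, non_malignant, normal):
    """Return (R1, R2) for one probeset.

    R1 = average of lung tumor samples / average of normal lung samples
    R2 = average of non-malignant lung disease samples / average of normal lung samples
    """
    baseline = mean(normal)
    if baseline == 0:
        raise ValueError("average of normal lung samples is zero")
    return mean(tumor) / baseline, mean(non_malignant) / baseline


# Hypothetical expression values for one probeset (not data from Table 9A).
r1, r2 = expression_ratios(
    tumor=[120.0, 80.0, 100.0],
    non_malignant=[30.0, 50.0],
    normal=[10.0, 30.0],
)
print(r1, r2)
```

Both ratios share the same denominator, so a value of 1.00 in either column indicates no change relative to normal lung under this reading of the definitions.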



400195 
400205 
400220 
400277 
400285 
400288 
400289 
400298 
400301 
400303 
400328 
400419 
400512 
400517 
400560 
400564 
400665 
400666 
400749 
400763 
401027 
401093 
401203 
401212 
401411 
401435 
401464 
401714 
401747 
401760 
401780 
401781 
401785 
401797 
401961 
401985 
401994 
402075 
402260 
402265 
402297 
402408 
402420 
402674 
402802 



ExAccn UnigeneID 



X06256 

X07820 

AA032279 

X03635 

AA242758 

X87344 

AF084545 

AF242388 



AF039241 



AF053004 



Hs.149609 

Hs.2258 

Hs.61635 

Hs.1657 

Hs.79136 

Hs.180062 



403137 
403306 
403329 
403381 
403478 
403485 
403827 
403715 
404044 
404076 
404101 
404140 
404165 
404185 
404210 
404253 



NMJJQ6825 



Unigene Title R1 R2 

NM_007057*:Homo sapiens ZW10 interactor 1.00 1.00 
NM_006265*:Homo sapiens RAD21 (S. pombe) 15.80 396.00 
Eos Control 2.28 2.84 
Eos Control 7.68 9.72 
Eos Control 1.00 1.00 
integrin, alpha 5 (fibronectin receptor, 1.04 2.24 
matrix metalloproteinase 10 (stromelysin 132.45 4.00 
six transmembrane epithelial antigen of 43.66 74.00 
estrogen receptor 1 1.00 1.00 
LIV-1 protein, estrogen regulated 1.75 1.65 
transporter 2, ATP-binding cassette, sub 0.87 1.80 
Target 156.55 253.00 
NM_030878*:Homo sapiens cytochrome P450, 1.00 2.00 
lengsin 3.67 87.00 
NM_030878*:Homo sapiens cytochrome P450, 1.00 1.00 
NM_002425:Homo sapiens matrix metallopro 20.26 45.00 
NM_002425:Homo sapiens matrix metallopro 1.36 1.07 
NM_002425:Homo sapiens matrix metallopro 3.26 3.22 
NM_003105*:Homo sapiens sortilin-related 1.00 91.00 
Target Exon 7.63 24.00 
Target Exon 1.00 1.00 
C12000586*:gi|6330167|dbj|BAA86477.1| (A 1.00 155.00 
Target Exon 1.00 86.00 
C12000457*:gi|7512178|pir||T30337 polypr 1.00 400.00 
ENSP00000247172*:HYPOTHETICAL 126.2kDa 1.00 72.00 
C1400039*:gi|7499898|pir||T33295 hypoth 1.00 64.00 
histone deacetylase 5 3.82 49.00 
ENSP00000241802*:CDNA FLJ11007 FIS, CLON 2.02 40.00 
Homo sapiens keratin 17 (KRT17) 128.43 68.00 
Target Exon 1.74 35.00 
NM_005557*:Homo sapiens keratin 16 (foca 26.47 10.50 
Target Exon 10.33 4.61 
NM_002275*:Homo sapiens keratin 15 (KRT1 4.13 2.70 
Target Exon 1.44 2.10 
NM_021626:Homo sapiens serine carboxypep 1.41 1.86 
class I cytokine receptor 1.00 177.00 
Target Exon 61.84 47.00 
ENSP00000251056*:Plasma membrane calcium 1.00 1.00 
NM_001436*:Homo sapiens fibrillarin (FBL 1.58 1.39 
Target Exon 2.09 35.00 
Target Exon 1.00 92.00 
NM_030920*:Homo sapiens hypothetical pro 28.87 13.00 
C1000823*:gi|10432400|emb|CAC10290.1| (A 1.00 1.44 
Target Exon 7.44 243.00 
NM_001397:Homo sapiens endothelin conver 1.00 70.00 
NM_002463*:Homo sapiens myxovirus (influ 1.37 1.43 
NM_005381*:Homo sapiens nucleolin (NCL). 1.00 19.00 
transmembrane protein (63kD), endoplasmi 1.00 43.00 
Target Exon 1.00 61.00 
ENSP00000231844*:Ecotropic virus integra 1.00 119.00 
NM_022342:Homo sapiens kinesin protein 9 28.13 136.00 
C3001813*:gi|12737279|ref|XP_012163.1| k 20.53 76.00 
Target Exon 6.30 29.33 
Target Exon 1.30 35.00 
ENSP00000237855*:DJ398G3.2 (NOVEL PROTEI 1.00 54.00 
NM_016020*:Homo sapiens CGI-75 protein ( 14.59 91.00 
C8000960*:gi|423560|pir||A47318 RNA-bindi 1.00 1.00 
NM_006510:Homo sapiens ret finger protei 1.42 1.44 
ENSP00000244562:NRH dehydrogenase [quino 1.00 54.00 
Target Exon 1.00 117.00 
NM_005936:Homo sapiens myeloid/lymphoid 5.93 13.77 
NM_021058*:Homo sapiens H2B histone (and 1.00 1.00 
405449 
405568 
405572 
405646 
405676 
405770 
405932 
406137 
406360 



408467 
406621 
406542 
406663 
406671 
406673 
406676 
406678 



X57809 

AJ245210 

U24683 

AA129547 

M34996 



406698 
406815 
406851 
406964 
405967 
406974 
407103 
407128 
407137 
407168 
407239 
407242 
407244 
407289 
407300 
407366 
407378 
407430 
407453 
407577 
407634 
407710 
407720 
407746 
407756 
407758 
407782 
407788 
407790 
407811 
407839 
407944 
408000 
408031 
408063 
408070 
408101 
408122 
408212 
408243 
408349 
408353 
408369 
408380 
408482 
408522 
408536 
408545 
408572 
408633 
408660 
408761 
408771 



U77534 

M18728 

M31126 

M29540 

X03068 

AA833930 

AA609784 

M21305 

M24349 

M57293 

AA424881 

R83312 

T97307 

R45175 

AA076350 

M18728 

M10014 

M135159 

M102616 

AF026942 

AA299264 

AF169351 

AJ132087 

AW131324 

AW016569 

AW022727 

AB037776 

AK001982 

M1 16021 

D50915 

AA608956 

BE514982 

AI027274 

AW190902 

AA045144 

R34008 

L11690 

M081395 

BE086548 

AW148852 

AW966504 

AI432652 

AA297567 

Y00787 

BE546947 

BE439838 

AI382803 

R38438 

AF123050 

NM_000676 

AI541214 

AW381532 

AW235405 

M055611 

AW963372 

AA525775 

AA057264 

AW732573 



Hs.181125 

Hs.293441 
Hs.285754 
Hs.198253 
Hs.81221 



Hs.272822 
Hs.220529 
Hs.73931 
Hs.288036 



Hs.256301 
Hs.237260 

Hs.117183 
Hs.67846 

Hs.75431 

Hs.203349 

Hs.120769 

Hs.271530 

Hs.57776 



Hs.246759 
Hs.136414 
Hs.23616 
Hs.38002 

Hs.38260 

Hs.38365 

Hs.112619 

Hs.38991 

Hs.288941 

Hs.40098 

Hs.161566 

Hs.239727 

Hs.620 

Hs.42173 

Hs.42346 

Hs.123073 

Hs.42824 

Hs.43728 

Hs.624 

Hs.44276 

Hs.44298 

Hs.159235 

Hs.182575 

Hs.44532 

Hs.45743 

Hs.46320 

Hs.135188 

Hs.253690 

Hs.226568 

Hs.46677 

Hs.238936 
Hs.47584 



C6001909:gi|704441|dbj|BAA18909.1| (D298 
C6001238*:gi|121715|sp|P26697|GTA3_CHICK 
Target Exon 

NM_021048:Homo sapiens melanoma antigen, 
NM_005596*:Homo sapiens nuclear factor I 
cholesteryl ester transfer protein, plas 
Target Exon 

NM_005365:Homo sapiens melanoma antigen, 
Target Exon 
Target Exon 

CY000047*:gi|11427234|ref|XP_009399.1| z 
NM_031413*:Homo sapiens cat eye syndrome 
Target Exon 

C12000200:gi|4557225|ref|NP_000005.1| al 
cytochrome c-1 

NM_002362:Homo sapiens melanoma antigen, 
C15000305:gi|3808122|gb|AAC69198.1| (AF0 
NM_000179*:Homo sapiens mutS (E. coli) h 
Target Exon 

NM_003122*:Homo sapiens serine protease 
Target Exon 

immunoglobulin lambda locus 
gb:Homo sapiens mRNA for immunoglobulin 
immunoglobulin heavy constant mu 
met proto-oncogene (hepatocyte growth fa 
major histocompatibility complex, class 
Human L2-9 transcript of unrearranged im 
gb:Human clone 1A11 immunoglobulin varia 
gb:Human nonspecific crossreacting antig 
pregnancy specific beta-1-glycoprotein 9 
carcinoembryonic antigen-related cell ad 
major histocompatibility complex, class 
tRNA isopentenylpyrophosphate transferas 
major histocompatibility complex, class 
gb:Human alpha satellite and satellite 3 
gb:Human parathyroid hormone-like protei 
gb:Human parathyroid hormone-related pep 
hypothetical protein MGC13170 
EST 

gb:ye53h05.s1 Soares fetal liver spleen 
ESTs 

leukocyte immunoglobulin-like receptor, 
gb:Human nonspecific crossreacting antig 
fibrinogen, gamma polypeptide 
Homo sapiens cDNA FLJ12149 fis, clone MA 
gb:zn43e07.s1 Stratagene HeLa cell s3 93 
gb:Homo sapiens ctg33 mRNA, partial sequ 
ESTs, Moderately similar to I38022 hypot 



gb:Homo sapiens mRNA for axonemal dynein 
hypothetical protein MGC12538 
UDP-GlcNAc:betaGal beta-1,3-N-acetylgluc 
ESTs 

KIAA1355 protein 

hypothetical protein FLJ11100 

ubiquitin specific protease 18 

KIAA0125 gene product 

ESTs, Moderately similar to PURKINJE CEL 

S100 calcium-binding protein A2 

Homo sapiens cDNA FLJ14866 fis, clone PL 

cysteine knot superfamily 1, BMP antagon 

ESTs 

desmocollin 2 

bullous pemphigoid antigen 1 (230/240kD) 
Homo sapiens cDNA FLJ10366 fis, clone NT 
calcineurin-binding protein calsarcin-1 
gb:xf05d05.x1 NCI_CGAP_Brn35 Homo sapien 
CDC2-related protein kinase 7 
hypothetical protein FLJ10718 
hypothetical protein 
interleukin 8 
homeobox C10 

mitochondrial ribosomal protein S17 
ESTs 

solute carrier family 15 (H777 transport 
diubiquitin 

adenosine A2b receptor 

Small proline-rich protein SPRK [human, 

ESTs 

ESTs 

ESTs, Moderately similar to ALU4_HUMAN A 
PRO2000 protein 

ESTs, Moderately similar to PC4259 ferri 
ESTs, Weakly similar to (defline not ava 
potassium voltage-gated channel, delayed 
1.00 
1.74 
3.91 
15.00 
3.09 
1.33 


1.45 


1.46 


2.85 


8.61 


8.50 


226.37 


350.00 


1.01 


2.52 


20.25 


32.00 


0.75 


1.91 


38.15 


1114.00 


1.00 


1.00 


1.00 


1.00 


1.77 


1.10 


1.00 


1.00 


142.70 


135.00 


2.16 


18.00 


1.10 


1.57 


1.12 


2.85 


3.24 


15.38 


3.53 


3.68 


19.74 


73.00 


0.06 


8.25 


1.00 


26.00 


1.00 


25.00 


1.00 


75.00 


1.00 


1.00 


111.20 


228.00 


1.00 


28.00 


1.89 


1.31 


1.00 


1.00 


4.51 


5.00 


1.00 


28.00 


0.97 


1.14 


7.88 


3.83 
TABLE 9C 

Pkey: Unique number corresponding to an Eos probeset 

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA 
sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495. 
Strand: Indicates DNA strand from which exons were predicted. 
Nt_position: Indicates nucleotide positions of predicted exons. 
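
The Strand and nucleotide-position columns can be read programmatically. The following Python sketch is offered only as an illustration: the function name `parse_exon` and the returned field names are hypothetical, and the length calculation assumes that both listed positions are inclusive, which the tables do not state. Minus-strand entries from the chromosome 22 sequence are listed with the larger coordinate first, so the two numbers are ordered before use.

```python
import re

# Matches "first-second" with optional surrounding whitespace.
_RANGE = re.compile(r"^\s*(\d+)\s*-\s*(\d+)\s*$")


def parse_exon(pkey, ref, strand, nt_position):
    """Parse one row of a genomic-positioning table into a dictionary."""
    match = _RANGE.match(nt_position)
    if match is None:
        raise ValueError(f"unrecognized nucleotide position: {nt_position!r}")
    if strand not in ("Plus", "Minus"):
        raise ValueError(f"unrecognized strand: {strand!r}")
    first, second = int(match.group(1)), int(match.group(2))
    start, end = min(first, second), max(first, second)
    return {
        "pkey": pkey,
        "ref": ref,
        "strand": strand,
        "start": start,
        "end": end,
        # Assumption: both endpoints are inclusive.
        "length": end - start + 1,
    }


# Two rows as they appear in the positioning data above.
plus_exon = parse_exon(325228, "6381940", "Plus", "2630-2694")
minus_exon = parse_exon(339264, "Dunham I. et al.", "Minus", "32975145-32975053")
print(plus_exon["length"], minus_exon["length"])
```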
TABLE 10A: Potential Therapeutic, Diagnostic and Prognostic Targets for Therapy of Lung Cancer and Non-malignant Lung Disease 

Table 10A shows about 307 genes up-regulated in non-malignant lung disease relative to lung tumors and normal body tissues and/or down-regulated in lung tumors relative to 
normal lung and non-malignant lung disease. These genes were selected from about 59680 probesets on the Eos/Affymetrix Hu03 Genechip array. 

Table 10B shows the accession numbers for those Pkey's lacking UnigeneID's for Table 10A. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the 
"Accession" column. 

Table 10C shows the genomic positioning for those Pkey's lacking Unigene ID's and accession numbers in Table 10A. For each predicted exon, we have listed the genomic 
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed. 



Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the 
average of normal lung samples 

R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, asthma) divided by the average of normal lung samples 
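
The selection described for Table 10A can be expressed in terms of the R1 and R2 ratios. The following Python sketch is one possible reading, not the procedure actually used: the fold-change cutoff `fold` and the function name `is_table_10a_candidate` are hypothetical, since no threshold is stated here.

```python
def is_table_10a_candidate(r1, r2, fold=2.0):
    """Return True if a probeset fits either Table 10A pattern.

    r1: tumor average / normal lung average
    r2: non-malignant disease average / normal lung average
    fold: hypothetical fold-change cutoff (an assumption, not from the text)
    """
    # Up-regulated in non-malignant disease relative to normal lung and tumors.
    up_in_disease = r2 >= fold and r2 >= fold * r1
    # Down-regulated in tumors relative to normal lung and non-malignant disease.
    down_in_tumor = r1 <= 1.0 / fold and r1 * fold <= r2
    return up_in_disease or down_in_tumor


print(is_table_10a_candidate(1.0, 59.0))
print(is_table_10a_candidate(0.4, 1.0))
print(is_table_10a_candidate(5.0, 1.0))
```

Because R1 and R2 share the normal lung average as denominator, comparing R2 with R1 directly compares the non-malignant disease average with the tumor average.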
TABLE 10B 



Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 



Pkey CAT Number Accession 

408074 103684.1 R20723 AA263003 AA333976 AA334725 AA334151 AW965490 AA310513 AI810530 031302 AW134897 AA830127 AA046953 AI668930 
C06094 AW104534 

411667 1253334.1 BE160198 AW935898 T11520 AW935930 AW856073 AW861034 

413533 1375344.1 BE146973 BE146972 BE147042 BE147018 BE146783 BE147020 BE146781 BE147019 BE146766 BE147021 BE146952 BE146767 BE147044 
BE146797 BE146776 BE146985 BE146793 BE146768 BE146771 BE146954 BE146760 BE147048 BE147025 BE147030 

423387 22779.1 AJ012074 U11087 L13288 X75299 L20295 AW630780 H14880 T28037 AI872991 R72136 AW449839 T81622 T79697 T29519 R94105 T83923 
R73300 AI797007 R73390 AA961010 H74168 AI669932 BE045543 AI808418 AI608912 AI806573 AW884084 AW872978 AW872985 AA565655 
AI022915 R50647 R73210 H45098 R46451 AW166269 T71132 AI264547 R52146 AI304920 R73391 AW884059 AW884085 H73241 T60038 
T79612 R73145 R50549 AI094557 AI668793 R72302 AI564365 W01956 AA418962 W32571 R72840 H45409 R72085 R46356 R46758 
AA508805 AA418798 T83751 R94072 T16182 AA928785 AA903896 

423698 23112.1 292546 AA330586 AI570568 AW341487 AI827050 AW298668 AI792169 AI015693 AI733599 AI572251 AI672488 AW193262 AI244716 
AI864375 AI206100 M91 2444 AI&9365 Al64(fcM A 

430212 314437.1 AA469153 AI718503 AA469225 

436532 421802.1 AA721522 AW975443 T93070 

453531 97026.1 AA417940 AA036735 T07025 

454741 1232559.1 BE154395 AW817959 BE154393 
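
Each row of Table 10B carries a Pkey, a CAT (gene cluster) number, and the Genbank accession numbers of the cluster members. As an illustration only, such a row could be split into its fields as sketched below in Python; the function name `parse_cluster_row` is hypothetical, and the sketch assumes the fields are separated by whitespace and that a row wrapped over several lines has already been joined into one string.

```python
def parse_cluster_row(row):
    """Split a 'Pkey CAT_number accession...' row into its three fields."""
    fields = row.split()
    if len(fields) < 3:
        raise ValueError(f"expected Pkey, CAT number and accessions: {row!r}")
    return {
        "pkey": int(fields[0]),
        "cat_number": fields[1],   # kept as text, e.g. "421802.1"
        "accessions": fields[2:],  # Genbank accession numbers in the cluster
    }


# Row for Pkey 436532 as listed in Table 10B.
cluster = parse_cluster_row("436532 421802.1 AA721522 AW975443 T93070")
print(cluster)
```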



TABLE 10C 

Pkey: Unique number corresponding to an Eos probeset 

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA 
sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495. 
Strand: Indicates DNA strand from which exons were predicted. 
Nt_position: Indicates nucleotide positions of predicted exons. 
TABLE 11A: Genes Distinguishing Adenocarcinoma from Other Lung Diseases and Normal Lung 

Table 11A shows about 34 genes up-regulated in lung adenocarcinomas relative to other lung tumors, non-malignant lung disease, and normal lung. These genes were selected 
from about 59680 probesets on the Eos/Affymetrix Hu03 Genechip array. 

Table 11B shows the accession numbers for those Pkey's lacking UnigeneID's for Table 11A. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the 
"Accession" column. 

Table 11C shows the genomic positioning for those Pkey's lacking Unigene ID's and accession numbers in Table 11A. For each predicted exon, we have listed the genomic 
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed. 

Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 

R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the 
average of normal lung samples 

R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, asthma) divided by the average of normal lung samples 
Hs.1074 


glutamine-fructose-6-phosphate transamin 


1 on 

I.UU 


ou.uu 


424502 


AF242388 


11-4 A ftrOC 

HS. 149585 


lengsin 


1 nn 
l.uu 


1 nn 

I.UU 


AtAKAA 

424044 


KA 0070 A 


ns.iowuo 


uopa OcCwOOxyiaSo \hi oiuauc L-airviiiu du 


1.00 


59.00 


424905 


NM_002497 


Hs.153704 


NIMA (never in mitosis gene a)-related k 


91 0.C 

£1.00 


1 on 

I.UU 


424960 


dc245o80 


HS.lbo9o2 


o nucieouoase^UfOj 


1 0(1 

I.UU 


1.00 


425523 


AB007948 


u~ icon A A 
HS.1oo244 


WAAU4ry protein 


1 00 

I.UU 


35.00 


426230 


AA307019 


HS.24 ljyi) 


protease, serine, 1 (trypsin 1) 


1 no 

I.UU 


83.00 


427701 


AA411101 

fWf 1 1 1 If 1 


Hs 243888 

f id»fc7vOOv 


nuclear autoantigenic sperm protein (his 


7.41 


34.00 


Pkey ExAccn UnigeneID Unigene Title R1 R2 

428585 AB007663 Hs.185140 KIAA0403 protein 1.00 6.00 
428758 AA433988 Hs.98502 hypothetical protein FLJ14303 1.06 1.13 
429170 NM_001394 Hs.2359 dual specificity phosphatase 4 16.18 105.00 
429263 AA019004 Hs.198396 ATP-binding cassette, sub-family A (ABC1 1.07 1.00 
429610 AB024937 Hs.211092 LUNX protein; PLUNC (palate lung and nas 1.59 1.69 
430508 AI015435 Hs.104637 ESTs 4.75 7.27 
430985 AA490232 Hs.27323 ESTs, Weakly similar to I78885 serine/th 0.94 1.28 
431548 AI834273 Hs.9711 novel protein 5.66 15.00 
431566 AF176012 Hs.260720 J domain containing protein 1 49.76 37.00 
431986 AA536130 Hs.149018 Novel human gene mapping to chromosome 20 1.19 1.47 
432375 BE536069 Hs.2962 S100 calcium-binding protein P 1.65 1.06 
432677 NM_004482 Hs.278611 UDP-N-acetyl-alpha-D-galactosamine:polyp 1.00 48.00 
433556 W56321 Hs.111460 calcium/calmodulin-dependent protein kin 1.00 19.00 
433819 AW511097 Hs.112765 ESTs 1.71 6.00 
434001 AW950905 Hs.3697 serine (or cysteine) proteinase inhibito 29.31 72.00 
434424 AI811202 Hs.325335 Homo sapiens cDNA: FLJ23523 fis, clone I 1.00 64.00 
434792 AA649253 Hs.132458 ESTs 8.52 44.00 
436217 T53925 Hs.107 fibrinogen-like 1 57.97 31.00 
436749 AA584890 Hs.5302 lectin, galactoside-binding, soluble, 4 1.10 1.41 
436972 AA264679 Hs.25640 claudin 3 1.59 1.46 
437866 AA156781 metallothionein 1E (functional) 3.62 101.00 
437935 AW939591 Hs.5940 mucin 13, epithelial transmembrane 1.60 1.39 
438915 AA280174 Hs.285681 Williams-Beuren syndrome chromosome regi 1.00 1.00 
439451 AF086270 Hs.278554 heterochromatin-like protein 1 23.28 52.00 
439759 AL359055 Hs.67709 Homo sapiens mRNA full length insert cDN 1.00 21.00 
441031 AI110684 Hs.7645 fibrinogen, B beta polypeptide 1.41 99.00 
441377 BE218239 Hs.202656 ESTs 22.03 1.00 
443614 AV655386 Hs.7645 fibrinogen, B beta polypeptide 1.00 16.00 
443813 AA876372 Hs.93961 Homo sapiens mRNA; cDNA DKFZp657O095 (fr 1.20 1.99 
443991 NM_002250 Hs.10082 potassium intermediate/small conductance 5.71 6.87 
444670 H58373 Hs.332938 hypothetical protein MGC5370 1.98 38.00 
444931 AV652056 Hs.75113 general transcription factor IIIA 1.00 54.00 
448102 AW168067 Hs.317694 ESTs 1.00 1.00 
446163 AA026880 Hs.25252 Homo sapiens cDNA FLJ13603 fis, clone PL 1.00 36.00 
446469 BE094848 Hs.15113 homogentisate 1,2-dioxygenase (homogenti 1.00 11.00 
447388 AW630534 Hs.76277 Homo sapiens, clone MGC:9381, mRNA, comp 1.24 1.16 
447532 AK000614 Hs.18791 hypothetical protein FLJ20607 1.23 1.63 
448243 AW369771 Hs.52620 integrin, beta 8 15.84 1.00 
448844 AI581519 Hs.177164 ESTs 1.00 31.00 
449444 AW818436 Hs.23590 solute carrier family 16 (monocarboxylic 1.00 83.00 
451807 W52854 hypothetical protein FLJ23293 similar to 1.55 35.00 
452689 F33868 Hs.284176 transferrin 1.54 1.44 
453392 U23762 Hs.32964 SRY (sex determining region Y)-box 11 1.00 16.00 
453464 AI884911 Hs.32989 receptor (calcitonin) activity modifying 1.55 2.45 
453735 AI066629 Hs.125073 ESTs 1.01 1.30 



TABLE 11B 

Pkey: Unique Eos probeset identifier number 
CAT number: Gene cluster number 
Accession: Genbank accession numbers 



CAT Number 
11995.1 



419502 18535.1 



421582 2041.1 



437866 44433.2 



451807 8865.1 



BE068889 BE068882 AF04431 1 AF017256 NMJ03087 AF037207 AF010126 AA633976 AA872836 BE298825 BE299889 A1016464 AI684600 
AI936527 AA804675 AA394097 AI139933 AA946608 BE171313 AA722407 AA293803 AI468480 AA056035 AA055968 AW796957 AI637713 
AA41 0737 H49348 AA486472 AA41 1 094 AA235594 AA402624 AA443638 AW452137 AA421 708 AW26521 1 AI493266 AA365132 AW966044 
AU076704 T74854 T74860 T72098 T73265 T73873 T69160 T74658 T58786 T60385 T7341 0 T68781 T67845 T67593 T73952 T67864 T60630 
T68367 T68401 T53959 T72360 T72099 T60377 T58961 T71712 T72821 T64738 T74645 T72037 T68688 T72063 T73258 T72826 T64242 
T6B220 T74673 T71800 T68355 T61227 T62738 T69317 T53850 T64692 T73768 T73962 T73382 T6B914 T70975 T73400 T60631 T73277 
T73203 T70498 T61409 T5B925 NMJJ0O5O8 M64982 T68301 T73729 T69445 T60424 T67922 T67736 T68716 T67755 T74765 T73819 T58719 
T74756 T60477 T74863 T61 109 T68329 T58850 T71857 T73425 T53736 T68607 T58898 T64309 T72031 T72079 T64305 T71908 T681 07 
T71916 T73787 T56035 T64425 T71870 T60476 T61376 T67820 T71895 T41006 T69441 T68170 T74617 T71958 T69440 T61875 R06796 
H48353 T71914 T53939 T641 21 AA693996 T72525 T67779 T68078 AA01 1465 AA345378 AV654647 AV654272 AV656001 AI06474O T82B97 
N33594 M344542 AW805054 A1207457 T61743 AA026737 H94389 AA382695 AA918409 T68044 S82092 T39959 AI017721 AA312395 
AA312919 T40156 H66239 AV652989 H38728 R98521 AV655200 R95790 W03250 W00913 AA344136 AV660126 R97923 AA343596 
AW470774 AV651256 N544 17 AA812862 AW182929 Ah 1 1 1 92 H61463 H72060 AA344503 H3B639 A/27751 1 AV661 108 AI207625 T4781 0 
AA235252 T27853 T47778 R95746 H70620 AA701463 AW827166 R98475 C20925 AV657287 T71959 T71313 T73920 T73333 T61618 T69293 
T69283 T73931 T721 78 T72456 AV645639 AV653476 n2957 T72300 T58906 T71457 T70494 T72956 T70495 T68267 T74407 T85778 
AA344726 T27854 T74485 T74101 T73868 T71518 T72304 AA343853 T73909 T68070 T72065 H72149 T73493 T73495 AV645993 R02293 
T70475 T64751 AA344441 AA343657 AA345732 AA344328 A11 10639 AA344603 AF063513 T6469S T68516 T72223 T60507 T67633 R29500 
T72517 R02292 T60599 T69206 T70452 T74677 R29366 T61277 T74914 T60352 R29675 T74843 AV645792 AA344408 T69197 T72Q57 
T69368 T69358 T68256 AV650429 T73341 T61702T74598 T40095 K02272 T40106 AA343045AA341908 AA341 907 AA342807AA34 1964 
T53747 T72042 T62764 AI064899 AA343060 T67832 T72440 T71770 T68091 T69108 T72449 T69167 T71289 T68251 AV654844 T64375 
AA345234 T67598 AA011414 T68036 H48262 AI207557 T68219 W86031 T69081 T64232 R93196 T62136 AV650539 H67459 T72978 
AA344583 T60362 H58121 T95711 T72803 T68055 T71715 R29036T72793 T69122 T64595 T62888 T69139 T68291 T64652 T67971 T46862 
AA693592 AJ248502 R29454 T64764 T57001 T73052 T71429 T51 1 76 T58866 AV655414 H90426 AA342489 T73666 T67848 T72512 T53835 
T67837 T73317 T74273 T69420 T68245 T74380 T67862 T74474 T56068 

A1910275 X00474 X52003 X05030 NM 003225 AA314326 AA308400 AA506787 AA314825 AI571948 AA507595 AA614579 AA587613 R83818 
AA568312 AA614409 AA307578 A1925552 AW950155 AI910083 M12075 BE074052 AW004668 AA578674 AA582084 BE074053 BE074126 
BE074140 AA514776 AA588034 BE074051 BE074068 AW009769 AW050590 AA658276 R55389 AI001 051 AW050700 AW750216 AA614539 
BE074045 A1307407 AW602303 BE073575 At202532 AA524242 AI970839 A1909751 BE076078 AI909749 R55292 
AA156781 AW293839 U52054 AA024963 AA778446 BE073977 AW444904 AW602574 BE164040 BE1 6401 2 BE1 63972 BE163974 BE1 63992 
AA837481 AW468444 BE165091 AW468002AA687333 AA81 1830 AA58 1806 AI866686 AI572 124 AA043777AA040926 D20160AI536733 
AA812489 AW874142 AI4718B3 W84421 AA156850 

W52854 AL117600 BE208116 BE208432 BE205239 BE0B2291 AW953423 AA351619 BE180648 BE140560W60Q80 AA865478 N90291 
AW450652 AW449519 AA993634 AI805539 AA351618 AW449522 AI827626 AA904788 AA380381 AAB86045 AA774409 BE003229 241756 
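
Because each cluster lists many Genbank accession numbers, a reverse lookup from accession to probeset can be convenient. The following Python sketch is illustrative only: the function name `index_by_accession` is hypothetical, and the `clusters` mapping holds just the first three accessions of the clusters for Pkeys 437866 and 451807 listed above, not the full clusters.

```python
from collections import defaultdict


def index_by_accession(clusters):
    """Build a mapping from accession number to the set of Pkeys listing it."""
    index = defaultdict(set)
    for pkey, accessions in clusters.items():
        for accession in accessions:
            index[accession].add(pkey)
    return dict(index)


# Truncated excerpts of two Table 11B clusters (first three accessions each).
clusters = {
    437866: ["AA156781", "AW293839", "U52054"],
    451807: ["W52854", "AL117600", "BE208116"],
}
accession_index = index_by_accession(clusters)
print(accession_index["U52054"])
```

A set is used for the values so that an accession shared by more than one cluster would map to every probeset that lists it.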



TABLE 11C 



Pkey: Unique number corresponding to an Eos probeset 

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA 
sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495. 
Strand: Indicates DNA strand from which exons were predicted. 
Nt_position: Indicates nucleotide positions of predicted exons. 



Pkey Ref Strand Nt_position 
403329 8516120 Plus 96450-96598 
63448-63554 
TABLE 12A: Genes Distinguishing Squamous Cell Carcinoma from Other Lung Diseases and Normal Lung 

Table 12A shows about 72 genes up-regulated in squamous cell carcinomas of the lung relative to other lung tumors, non-malignant lung disease, and normal lung. These genes 
were selected from about 59680 probesets on the Eos/Affymetrix Hu03 Genechip array. 

Table 12B shows the accession numbers for those Pkey's lacking UnigeneID's for Table 12A. For each probeset we have listed the gene cluster number from which the 
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence 
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the 
"Accession" column. 

Table 12C shows the genomic positioning for those Pkey's lacking Unigene ID's and accession numbers in Table 12A. For each predicted exon, we have listed the genomic 
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed. 



Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number 

Unigene Title: Unigene gene title 



R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the
average of normal lung samples

R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, asthma) divided by the average of normal lung samples

Pkey    ExAccn       UnigeneID    Unigene Title                             R1      R2

400289  X07620       Hs.2258      matrix metalloproteinase 10 (stromelysin  132.45  4.00
400666                            NM_002425:Homo sapiens matrix metallopro  3.26    3.22
401780                            NM_005557*:Homo sapiens keratin 16 (foca  26.47   10.50
401781                            Target Exon                               10.33   4.61
401785                            NM_002275*:Homo sapiens keratin 15 (KRT1  4.13    2.70
401994                            Target Exon                               61.84   47.00
402075                            ENSP00000251056*:Plasma membrane calcium  1.00    1.00
404996                            Target Exon                               1.00    1.00
407839  AA045144     Hs.161566    ESTs                                      173.91  108.00
408000  L11690       Hs.620       bullous pemphigoid antigen 1 (230/240kD)  151.17  6.00
408522  AI541214     Hs.46320     Small proline-rich protein SPRK [human,   1.98    1.24
410561  BE540255     Hs.6994      Homo sapiens cDNA: FLJ22044 fis, clone H  10.04   1.00
415091  AL044872     Hs.77910     3-hydroxy-3-methylglutaryl-Coenzyme A sy  1.00    30.00
415817  U88967       Hs.78867     protein tyrosine phosphatase, receptor-t  24.30   1.00
416658  U03272       Hs.79432     fibrillin 2 (congenital contractural ara  53.29   51.00
417034  NM_006183    Hs.80962     neurotensin                               1.00    1.00
417366  BE185289     Hs.1076      small proline-rich protein 1B (cornifin)  8.97    3.27
418663  AK001100     Hs.41690     desmocollin 3                             112.17  19.00
418678  NM_001327    Hs.87225     cancer/testis antigen                     1.18    1.10
419121  AA374372     Hs.89626     parathyroid hormone-like hormone          1.00    1.00
420783  AI659838     Hs.99923     lectin, galactoside-binding, soluble, 7   3.04    1.25
421773  W69233       Hs.112457    ESTs                                      1.12    1.14
421948  L42583       Hs.334309    keratin 6A                                51.83   20.25
421978  AJ243662     Hs.110196    NICE-1 protein                            1.01    0.91
422158  L10343       Hs.112341    protease inhibitor 3, skin-derived (SKAL  2.37    1.10
422440  NM_004812    Hs.116724    aldo-keto reductase family 1, member B10  47.53   32.00
423634  AW959908     Hs.1690      heparin-binding growth factor binding pr  76.02   1.00
423725  AJ403108     Hs.132127    hypothetical protein LOC57822             4.20    1.00
423738  AB002134     Hs.132195    airway trypsin-like protease              10.14   51.00
424012  AW368377     Hs.137569    tumor protein 63 kDa with strong homolog  233.42  68.00
424046  AF027866     Hs.138202    serine (or cysteine) proteinase inhibito  1.00    1.00
424098  AF077374     Hs.139322    small proline-rich protein 3              137.82  54.00
424834  AK001432     Hs.153408    Homo sapiens cDNA FLJ10570 fis, clone NT  56.19   12.00
425650  NM_001944    Hs.1925      desmoglein 3 (pemphigus vulgaris antigen  33.45   1.00
427099  AB032953     Hs.173560    odd Oz/ten-m homolog 2 (Drosophila, mous  4.24    17.00
427335  AA448542     Hs.251677    G antigen 7B                              51.83   4.00
428182  BE386042     Hs.293317    ESTs, Weakly similar to GGC1_HUMAN G ANT  1.00    1.00
428645  AA431400     Hs.98729     ESTs, Weakly similar to 2017205A dihydro  1.00    16.00
428748  AW593206     Hs.98785     Ksp37 protein                             1.00    87.00
429259  AA420450     Hs.292911    ESTs, Highly similar to S60712 band-6-pr  2.01    1.18
429538  BE182592     Hs.11261     small proline-rich protein 2A             4.43    2.90
429903  AL134197     Hs.93597     cyclin-dependent kinase 5, regulatory su  11.80   1.00
430486  BE062109     Hs.241551    chloride channel, calcium activated, fam  12.28   41.00
430890  X54232       Hs.2699      glypican 1                                1.58    1.40
431009  BE149762     Hs.48956     gap junction protein, beta 6 (connexin 3  60.25   28.00
431846  BE019924     Hs.271580    uroplakin 1B                              4.49    1.51
433091  Y12642       Hs.3185      lymphocyte antigen 6 complex, locus D     1.20    1.09
434360  AW015415     Hs.127780    ESTs                                      40.98   27.00
434880  U02388       Hs.101       cytochrome P450, subfamily IVF, polypept  1.00    1.00
435505  AF200492     Hs.211238    interleukin-1 homolog 1                   1.00    38.00
435793  AB037734     Hs.4993      KIAA1313 protein                          23.68   42.00
436511  AA721252     Hs.291502    ESTs                                      16.76   14.00
438403  AA806607     Hs.292206    ESTs                                      1.00    1.00
439285  AL133916                  hypothetical protein FLJ20093             46.23   139.00
439606  W79123       Hs.58561     G protein-coupled receptor 87             33.61   1.00
439670  AF088076     Hs.59507     ESTs, Weakly similar to AC004858 3 U1 sm  1.00    1.00
439706  AW872527     Hs.59761     ESTs, Weakly similar to DAP1_HUMAN DEATH  86.55   11.00
440325  NM_003812    Hs.7164      a disintegrin and metalloproteinase doma  62.88   147.00
441525  AW241867     Hs.127728    ESTs                                      1.53    1.42
443162  T49951       Hs.9029      DKFZP434G032 protein                      31.11   38.00
444378  R41339       Hs.12569     ESTs                                      1.00    1.00
446292  [illegible]  [illegible]  Rh type C glycoprotein                    1.55    1.26
447078  [illegible]  [illegible]  ESTs                                      47.24   24.00
447342  AI199268                  Homo sapiens, Similar to RIKEN cDNA 2010  28.63   1.00
449003  X76342       Hs.389       alcohol dehydrogenase 7 (class IV), mu o  1.00    1.00
449101  [illegible]  [illegible]  G protein-coupled receptor                2.58    27.00
450832  [illegible]  Hs.105421    ESTs                                      25.17   36.00
452240  [illegible]  [illegible]  ESTs                                      13.42   1.00
453317  NM_002277    Hs.41696     keratin, hair, acidic, 1                  1.19    1.27
453830  AA534295     Hs.20953     ESTs                                      24.92   25.00
454098  W27953       Hs.292911    ESTs, Highly similar to S60712 band-6-pr  1.26    1.11
455601  AI358680     Hs.816       SRY (sex determining region Y)-box 2      206.11  1.00






TABLE 12B 

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers

Pkey    CAT Number   Accession

439285  47055_1      AL133916 N79113 AF086101 N76721 AW950828 AA364013 AW955684 AI346341 AI867454 N54784 AI655270 AI421279 AW014882
                     AA775552 N62351 N59253 AA626243 AI341407 BE175639 AA455968 AI358918 AA457077



TABLE 12C 



Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA
sequence of human chromosome 22," Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.



Pkey     Ref       Strand   Nt_position

400566   8118496   Plus     17982-18115,20297-20456
401780   7249190   Minus    28397-28617,28920-29045,29135-29296,29411-29567,29705-29787,30224-30573
401781   7249190   Minus    83215-83435,83531-83656,83740-83901,84237-84393,84955-85037,86290-86814
401785   7249190   Minus    165776-165996,166189-166314,166408-166569,167112-167268,167387-167469,168634-168942
401994   4153858   Minus    42904-43124,43211-43336,44607-44763,45199-45281,46337-46732
402075   8117407   Plus     121907-122035,122804-122921,124019-124161,124455-124610,125672-126076
404996             Plus     37999-38145,38652-38998,39727-39872,40557-40674,42351-42450
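
Each nucleotide-position entry in the table above is a comma-separated list of hyphenated coordinate pairs, one pair per predicted exon. Purely as an illustration, a minimal Python sketch for reading such an entry follows. The function names `parse_nt_position` and `exon_lengths` are hypothetical, and the sketch assumes that both coordinates of a pair are inclusive, which the tables do not state.

```python
from typing import List, Tuple


def parse_nt_position(field: str) -> List[Tuple[int, int]]:
    """Split 'start-end,start-end,...' into (start, end) integer pairs."""
    exons = []
    for part in field.split(","):
        start_text, end_text = part.strip().split("-")
        start, end = int(start_text), int(end_text)
        if end < start:
            raise ValueError(f"exon end precedes start in {part!r}")
        exons.append((start, end))
    return exons


def exon_lengths(exons: List[Tuple[int, int]]) -> List[int]:
    """Length of each exon, assuming inclusive coordinates (an assumption)."""
    return [end - start + 1 for start, end in exons]


# Entry listed for Pkey 402075 (Ref 8117407, Plus strand).
entry = "121907-122035,122804-122921,124019-124161,124455-124610,125672-126076"
exons = parse_nt_position(entry)
lengths = exon_lengths(exons)
```

Under the inclusive-coordinate assumption, the five predicted exons of this entry would span 129, 118, 143, 156 and 405 nucleotides.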
TABLE 13A: Genes Distinguishing Non-Malignant Lung Disease from Lung Tumors and Normal Lung

Table 13A shows about 23 genes upregulated in non-malignant lung disease relative to lung tumors and normal lung. These genes were selected from about 59680 probesets on
the Eos/Affymetrix Hu03 Genechip array.

Table 13B shows the accession numbers for those Pkey's lacking UnigeneID's for Table 13A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Table 13C shows the genomic positioning for those Pkey's lacking Unigene ID's and accession numbers in Table 13A. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

R1: Average of lung tumors (including squamous cell carcinomas, adenocarcinomas, small cell carcinomas, granulomatous and carcinoid tumors) divided by the
average of normal lung samples

R2: Average of non-malignant lung disease samples (including bronchitis, emphysema, fibrosis, atelectasis, asthma) divided by the average of normal lung samples
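
R1 and R2 are each defined in the legend as one group average divided by the average of normal lung samples. A minimal Python sketch of that arithmetic is given below for illustration only. The function name `expression_ratio` and all expression values are hypothetical and are not taken from the tables, and no normalization or flooring of values is assumed because the legend does not describe any.

```python
from statistics import mean
from typing import Sequence


def expression_ratio(group: Sequence[float], normal: Sequence[float]) -> float:
    """Average of one sample group divided by the average of normal lung samples."""
    normal_average = mean(normal)
    if normal_average == 0:
        raise ValueError("average of normal samples is zero")
    return mean(group) / normal_average


# Hypothetical expression values for a single probeset (not taken from the tables).
normal_lung = [2.0, 2.0, 2.0]
lung_tumors = [10.0, 14.0, 12.0]
non_malignant = [300.0, 260.0, 340.0]

r1 = expression_ratio(lung_tumors, normal_lung)    # lung tumors / normal lung
r2 = expression_ratio(non_malignant, normal_lung)  # non-malignant disease / normal lung
```

With these made-up inputs, a probeset would show a low R1 and a high R2, the pattern seen for most rows of Table 13A.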



Pkey    ExAccn       UnigeneID    Unigene Title                             R1      R2

408562  AI436323     Hs.31141     Homo sapiens mRNA for KIAA1568 protein,   1.00    230.00
409031  AA376836     Hs.76728     ESTs                                      1.00    128.00
412372  R65998       Hs.285243    hypothetical protein FLJ2208              1.00    173.00
415910  U20350       Hs.78913     chemokine (C-X3-C) receptor 1             1.00    145.00
417511  AL049176     Hs.82223     chordin-like                              1.00    179.00
418819  AA228776     Hs.191721    ESTs                                      1.00    140.00
422060  R20893       Hs.325823    ESTs, Moderately similar to ALU5_HUMAN A  1.00    156.00
424585  AA464840     Hs.131987    ESTs                                      1.00    167.00
426753  T89832       Hs.170278    ESTs                                      1.00    141.00
429496  AA453800     Hs.192793    ESTs                                      1.00    138.00
430719  AA488988     Hs.293796    ESTs                                      1.00    133.00
431089  BE041395                  ESTs, Weakly similar to unknown protein   23.32   941.00
431385  BE178535     Hs.11090     membrane-spanning 4-domains, subfamily A  1.00    157.00
431728  NM_007351    Hs.268107    multimerin                                1.00    157.00
436532  AA721522                  gb:nv54h12.r1 NCI_CGAP_Ew1 Homo sapiens   1.00    218.00
437960  AI659586     Hs.222194    ESTs                                      1.00    147.00
438202  AW169287     Hs.22588     ESTs                                      1.00    141.00
441499  AW298235     Hs.101689    ESTs                                      1.00    167.00
444513  AL120214     Hs.7117      glutamate receptor, ionotropic, AMPA 1    1.00    151.00
448253  H25899       Hs.201591    ESTs                                      1.00    141.00
453636  R67837       Hs.169872    ESTs                                      1.00    116.00
458332  AI000341     Hs.220491    ESTs                                      1.00    192.00
459587  AA031956                  gb:zk15e04.s1 Soares_pregnant_uterus_NbH  1.00    154.00



TABLE 13B 

Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers

Pkey    CAT Number   Accession

431089  327825_1     BE041395 AA491826 AA621946 AA715980 AA666102
436532  421802_1     AA721522 AW975443 T93070



TABLE 13C 



Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA
sequence of human chromosome 22," Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.

Pkey     Ref       Strand   Nt_position

402075   8117407   Plus     121907-122035,122804-122921,124019-124161,124455-124610,125672-126076
TABLE 14A: Preferred Utility and Subcellular Localization for Potential Lung Disease Targets

Table 14A shows the subcellular localization and preferred utility for the genes appearing in Tables 9A and 10A. mAb symbolizes monoclonal antibody, diag symbolizes
diagnostic, s.m. symbolizes small molecule, and CTL symbolizes cytotoxic lymphocytic ligand. These genes were selected from 59680 probesets on the Eos/Affymetrix Hu03
Genechip array.

Table 14B shows the accession numbers for those Pkey's lacking UnigeneID's for Table 14A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Table 14C shows the genomic positioning for those Pkey's lacking Unigene ID's and accession numbers in Table 14A. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.



Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title

Pref.Utility: Preferred Utility

Pred.Loc: Predicted subcellular localization



Pkey    ExAccn       UnigeneID    Unigene Title                             Pref.Utility        Pred.Loc


4Uuzo? 


Y/Y7QOA 


Lie OOCQ 


matrix metalloproteinase 10(strome)ysin 


mAb & diag & s.m. 


BAl/avowUJaJ 


AUUoUo 


A/U42/00 


Ue 7Q-11R 

ns./yi jo 


UV-1 protein, estrogen regulated 


mAb 


plasma membrane 


402075 






ENSP00000251056*:Plasma membrane calcium mAb & diag 




AtVTQi i 




nS.4UUUU 


cysteine knot superfamDy 1, BMP antagon 


diag 






ruu/o/ 


i-lc MA 


interieuxin 8 


diag 




400790 


AW50UZ27 


HS.4f 000 


neurotrophic tyrosine kinase, receptor, 


mAb & s.m. 


nhrma momhrano 
piddllld MIDIIlUIOllp 


4U09U0 


QC1QG017 


nS./DUOZZ 


serine/threonine kinase 15 


s.m. 


cytoplasm 


409041 


AblXwU/O 


il cnnQ4 
nS.oUUOl 


Hypothetical protein, XP_051o60 (KIAA1 19 


CTL & diag 








Ue *H09ftfl 

nS. nzzWJ 


XAGt-1 protein 


CTL 


nuclear 


403420 


74 CARD 
£15008 


l_U CJI4C4 

US. 54451 


tamlnln, gamma 2 (nicein (lOOkD), kalini 


diag 


ScviclcQ 


4U95JZ 


W/4UU1 


Ue CC97Q 


serine (or cysteine) proteinase inhibilo 


diag 




4U9/bf 


kiii nn4Qno 
NM_UU 1 oau 


Ue 19111 A 


cystatjn SN 


diag 


OAUd^BllUldl 


4U90:M 


AW24/USU 


Ue C71H1 


minichromosome maintenance deficient (S. 


CTL 


nuclear 


409956 


AWlUooM 


Up 717 


Innibin, beta A (activm A, acuvm AB a 


diag 


ovt rar a 11 1 liar 


410001 


AB041UOD 


m. C7774 


kailikrein 11 


diag 


ovtranolli liar 


AAftA(\~7 

410407 


Abbojy 


rS.OoZO/ 


carbonic annydrase IX 


mAb & s.ra 


pidsllld IliolllUlailo 


410418 


D31382 


HS.D3J25 


transmembrane protease, serine 4 


mAb & diag & s.m. 


piaSina iiicmurdno 


AAtAAt\ 

41 21 40 


A A010CQ4 

AA219091 


Ue 71ft9K 

nS.7J025 


RAB6 Interacting, klnesin-like (rabwnes 


s.m. 




A* T710 

412719 


AWUlbDlU 


Ue QIC 

nS.010 


ESTs 


Sin. 




A A All A 

414774 


XD2419 


l_u 7707 J 
HS.77274 


plasminogen activator, urokinase 


diag 


avlra^alli liar 


414883 


AA926960 




CDC28 protein kinase 1 


s.m 




415138 


C1B35o 


HS.295944 


tissue factor pathway inhibitor 2 


CTL & diag 


avlrar all 1 1 1 ar 
BAUaCcllUlal 


415009 


mm nncnoc 
NM_00502o 


Lin 7QC0Q 

rlS.7o5o9 


serine (or cysteine) proteinase inhibito 


mAo & diag & s.m. 




415817 


1 IOOQC7 

Uoo9o7 


nS.7ooo7 


protein tyrosine phosphatase, receptor-t 


mAb & s.ra 


p loo i rid i nuinui una 


Ai CCCQ 
410000 


UUO//2 


Utt 7QA11 

nS./94oZ 


fibrillin 2 (congenital contracture) are 


diag 


extracellular 


A A VMA 

417034 


mm nnc4Qi 
NMJJUolod 


nS.oOsOZ 


neurotensin 


diag 


a vtrar ollt 1 1 ar 

eAUaceiiuLdx 


4170/9 


1 icccon 
U05590 


HS.ollvW 


interleukin 1 receptor antagonist 


diag 


tJAlJaCollLlidJ 


417308 


H6Q72Q 


n„ O4OO0 

HS.Olo92 


KIAA0101 gene product 


s.m. 


mitochondrial 


AA 7*3QQ 
417309 


DC20U904 


MS.OZU40 


midkine (neurits growth-promoting factor 


mAb & diag 


ocwlcltnJ 


AA1AM 

41 f40J 


DCZ/UIOO 


Ue fill Ifl 


5T4 oncofetal trophobtast glycoprotein 


mAb 


nta^ma mAfnhrflnA 

UtddlllO 1 1 ICI I UN Ol io 


417933 


aU/oUo ; - 


-nS.oZUOZ 


thymtdylate synthetase 


s.m. 


nnHrtntacmlp ra)!nitiim 

ollUUptdSlIUu iCtUfUlUUI 


A A QA7Q 


IIO.QOAC 


nS. il /4 


cyclin-dependent kinase inhibitor 2A (me 


s.m. 


LjUJyioolll 


1IOOUD 






fl rirn)oin_mitn1pH rprnntnr 

^ fit viOHI^UUplCu ICvOpiUI 


mAh Aim 


plasma membrane 


A A QC7Q 

4l0OfO 


MM AH4Q07 


Ue 1R717Q 


cancer/tesOs antigen (NY-ESO-1) 


CTL 


rulnnlat;mif* 
UjlUpiOOilHU 


419l£l 


A A 17X170 

AAJ/40/Z 


Ue RQROfi 

ns.oyozo 


parathyroid hormone-like hormone 


diag 






(\W1_UUZ040 


ns.oyooo 


protein tyrosine phosphatase, receptor t 


mAb & s.m. 


niacma momhranfl 
pioolild iiiQiiiuiaiiD 


419183  U60669       Hs.89663     cytochrome P450, subfamily XXIV (vitamin  CTL & s.m.          mitochondrial
419216  AU076718     Hs.164021    small inducible cytokine subfamily B (Cy  diag                secreted
419235  AW470411     Hs.288433    neurotrimin                               mAb & diag          plasma membrane
419452  U33635       Hs.90572     PTK7 protein tyrosine kinase 7            mAb & s.m.          plasma membrane
419556  U29615       Hs.91093     chitinase 1 (chitotriosidase)             mAb & diag          extracellular
420610  AI683183     Hs.99346     distal-less homeo box 5                   CTL                 nuclear
421110  AJ250717     Hs.1355      cathepsin E                               s.m. & diag         extracellular
421379  Y15221       Hs.103932    small inducible cytokine subfamily B (Cy  diag                secreted
421474  U76362       Hs.104637    solute carrier family 1 (glutamate trans  mAb & s.m.          plasma membrane
421552  AF026692     Hs.105700    secreted frizzled-related protein 4       diag                secreted
421753  BE314828     Hs.107911    ATP-binding cassette, sub-family B (MDR/  mAb & s.m.          plasma membrane
421817  AF146074     Hs.108660    ATP-binding cassette, sub-family C (CFTR  mAb & s.m.          plasma membrane
422109  S73265       Hs.1473      gastrin-releasing peptide                 diag                secreted
422158  L10343       Hs.112341    protease inhibitor 3, skin-derived (SKAL  diag                secreted
422282  AF019225     Hs.114309    apolipoprotein L                          diag
422283  AW411307     Hs.114311    CDC45 (cell division cycle 45, S.cerevis  s.m.                nuclear
422424  AI186431     Hs.296638    prostate differentiation factor           diag                extracellular
422765  AW409701     Hs.1578      baculoviral IAP repeat-containing 5 (sur  s.m.                cytoplasm
422809  AK001379     Hs.121028    hypothetical protein FLJ10549             s.m.                nuclear
422867  L32137       Hs.1564      cartilage oligomeric matrix protein (pse  diag                extracellular
422956  BE545072     Hs.122579    ECT2 protein (Epithelial cell transformi  CTL & s.m.
423634  AW959908     Hs.1690      heparin-binding growth factor binding pr  mAb & diag & s.m.
423673  BE003054     Hs.1695      matrix metalloproteinase 12 (macrophage                       secreted
423961  D13666       Hs.136348    periostin (OSF-2os)                       mAb & diag          extracellular
424046  AF027866     Hs.138202    serine (or cysteine) proteinase inhibito  diag                secreted
424381  AA285249     Hs.146329    protein kinase Chk2                       s.m.                nuclear


424502  AF242388     Hs.149585
424503  NM_002205    Hs.149509
424697  J05070       Hs.151738
425247  NM_005940    Hs.155324
425322  U63630       Hs.155637
425650  NM_001944    Hs.1925
425734  AF056209     Hs.159396
425776  U25128       Hs.159499
425852  AK001504     Hs.159551
426215  AW963419     Hs.155223
426427  M86699       Hs.169840
426514  BE616633     Hs.170195
427335  AA448542     Hs.251677
427747  AW411425     Hs.180855
428242  H55709       Hs.2250
428330  L22524       Hs.2256
428450  NM_014791    Hs.184339
428479  Y00272       Hs.334562
428484  AF104032     Hs.184601
428654  AK001666     Hs.189095
428698  AA852773     Hs.334838
428748  AW593206     Hs.98785
428758  AA433988     Hs.98502
428969  AF120274     Hs.194689
429211  AF052693     Hs.198249
429263  AA019004     Hs.198396
429547  AW009166     Hs.99376
429610  AB024937     Hs.211092
429903  AL134197     Hs.93597
430486  BE062109     Hs.241551
431462  AW583672     Hs.256311
431515  NM_012152    Hs.258583
431846  BE019924     Hs.271580
431958  X63629       Hs.2877
432201  AI538613     Hs.298241
433001  AF217513     Hs.279905
435505  AF200492     Hs.211238
436481  AA379597     Hs.5199
437016  AU076916     Hs.5398
437044  AL035864     Hs.69517
437789  AI581344     Hs.127812
437852  BE001836     Hs.256897
439223  AW238299     Hs.250618
439477  W69813       Hs.58042
439606  W79123       Hs.58561
439738  BE246502     Hs.9598
440006  AK000517     Hs.6844
441362  BE614410     Hs.23044
442117  AW664984     Hs.128699
443247  BE614387     Hs.333893
443426  AF098158     Hs.9329
443859  NM_013409    Hs.9914
444006  BE395085     Hs.10086
444371  BE540274     Hs.239
444381  BE387335     Hs.283713
444781  NM_014400    Hs.11950
445537  AJ245671     Hs.12844
446619  AU076643     Hs.313
446921  AB012113     Hs.16530
447033  AI357412     Hs.157601
447342  AI199268     Hs.19322
448243  AW359771     Hs.52620
448844  AI581519     Hs.177164
449048  Z45051       Hs.22920
449722  BE280074     Hs.23960
450001  NM_001044    Hs.406
450375  AA009647
450701  H39960       Hs.288467
450983  AA305384     Hs.25740
451668  Z43948       Hs.326444
452281  T93500       Hs.28792
452401  NM_007115    Hs.29352
452747  BE153855     Hs.61460
452838  U65011       Hs.30743
453968  AA847843     Hs.62711
457489  AI693815     Hs.127179
TABLE 14B



Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



Pkey CAT Number Accession
414883 15024 1 AA926960 AA926959 W76521 W2427OVW1526MO37172BE267636 H83186 M4699O9N86396AA0O1348BE535736AAO8U 

M082436 H72525 H77575 N49786 W80565 H7B746 BE563085 W04339 R98127 T55938 BE279271 AW960304 T29812 AA476873 BE297387 
AA292753 AA177048 NM_C01826 X54941 BE314366M908783 A1719075 BE270172 BE269819 AA889955 AI2Q4630 W25243 AI935150 
AA872039 W72395 T99630 AI422691 H98460 N31428 BE25S916 H03265 AI857576 M776920 M910B44 AA459522 AA293140 AW514667 
R75953 AW662396 AA6S2522 A18S5147 AI423153 AW262230 AA584410 AA583187 AW024595 AW069734 AI828996 AA282997 AA876046 
AW61 3002 AA527373 AW972459 AI831360 AA621 337 AA1 00926 AA77241 8 AA594628 AI033892 W95096 Al 0343 17 AA398727 AI085031 
N95210 AI459432 AI041437 AA932124 AA6276B4 AA935829 AI004827 AI423513 AI094597 H42079 R54703 AI630359 AA6 17681 AA978045 
AA643250 W44551 A1991988 AI537692 A1O90262 AA740817 AI312104 AI91 1822 AA416871 AI185409 AA129784 AA701B23 AJ 07 5239 
AI139549 AA633648 AI339996 AI33S880AA399239 AI078708 A1085351 A1362835 A1346618 AI1469S5 AI989380 AI348243 N92892 AA765850 
AI494230 AI27B887 AA962596 AI492600 W80435 AA001979 R97424 AI129015 N24127 AA157451 AA235549 AA459292 AA0371 14 AA129785 
A1494211 AW059801 AW886710 R92790 N59755 AI36112B AW5B9407 H47725 H97534 H48076 H48450 T99631 AW300758 H03431 R76769 
AA954344 H77576 R96823 AI457100 N92845 N49682 H42038 BE220698 BE220715 H99552 AA701624 N74173 R54704 H79520 H72923 
H03266 BE261919 AA789633 AA480310 AA507454 AA910586 AI203723 AW104725 W2561 1 W25071 T88980 H03513 T77589 R99156 
W95095 R97470 AA702275 T77551 AA91 1 952 H82956 N83673 AA283S72 
450375 83327_1 AA009647 AA131254 AA374293 AW954405 H04410 AW605284 AA151166 BE157467 BE157601 H04384 W46291 AW663674 H04021 H01532

AA190993 H03231 H59605 H01642 AA852876 AA113758 AA626915 AA746952 AI161014 AA099554 R69067



TABLE 14C 

Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA
sequence of human chromosome 22," Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.

Pkey     Ref       Strand   Nt_position

402075   8117407   Plus     121907-122035,122804-122921,124019-124161,124455-124610,125672-126076
TABLE 15A: Information for all sequences in Table 16

Table 15A shows the Seq ID No, Pkey, ExAccn, UnigeneID, and Unigene Title for all of the sequences in Table 16.

Table 15B shows the accession numbers for those Pkey's lacking UnigeneID's for Table 15A. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence
similarity using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the
"Accession" column.

Table 15C shows the genomic positioning for those Pkey's lacking Unigene ID's and accession numbers in Table 15A. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also listed.



Seq ID No: Sequence ID number

Pkey: Unique Eos probeset identifier number

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title



Seq ID No:            Pkey    ExAccn       UnigeneID    Unigene Title

Seq ID No: 1 & 2      410407  X66839       Hs.63287     carbonic anhydrase IX
Seq ID No: 3 & 4      412719  AW016610     Hs.816       ESTs
Seq ID No: 5 & 6      417034  NM_006183    Hs.80962     neurotensin
Seq ID No: 7 & 8      430486  BE062109     Hs.241551    chloride channel, calcium activated, fam
Seq ID No: 9 & 10     407788  BE514982     Hs.38991     S100 calcium-binding protein A2
Seq ID No: 11 & 12    407788  BE514982     Hs.38991     S100 calcium-binding protein A2
Seq ID No: 13 & 14    407788  BE514982     Hs.38991     S100 calcium-binding protein A2
Seq ID No: 15 & 16    407788  BE514982     Hs.38991     S100 calcium-binding protein A2
Seq ID No: 17 & 18    439285  AL133916                  hypothetical protein FLJ20093
Seq ID No: 19 & 20    413753  U17760       Hs.75517     laminin, beta 3 (nicein (125kD), kalinin
Seq ID No: 21 & 22    120486  AW368377     Hs.137569    tumor protein 63 kDa with strong homolog
Seq ID No: 23 & 24    425650  NM_001944    Hs.1925      desmoglein 3 (pemphigus vulgaris antigen
Seq ID No: 25 & 26    412140  AA219691     Hs.73625     RAB6 interacting, kinesin-like (rabkines
Seq ID No: 27 & 28    423673  BE003054     Hs.1695      matrix metalloproteinase 12 (macrophage
Seq ID No: 29 & 30    452838  U65011       Hs.30743     preferentially expressed antigen in mela
Seq ID No: 31 & 32    418663  AK001100     Hs.41690     desmocollin 3
Seq ID No: 33 & 34    418663  AK001100     Hs.41690     desmocollin 3
Seq ID No: 35 & 36    409632  W74001       Hs.55279     serine (or cysteine) proteinase inhibito
Seq ID No: 37 & 38    429610  AB024937     Hs.211092    LUNX protein; PLUNC (palate lung and nas
Seq ID No: 39 & 40    406690  M29540       Hs.220529    carcinoembryonic antigen-related cell ad
Seq ID No: 41 & 42    431846  BE019924     Hs.271580    uroplakin 1B
Seq ID No: 43 & 44    418830  BE513731     Hs.88959     hypothetical protein MGC4816
Seq ID No: 45 & 46            AF077374     Hs.139322    small proline-rich protein 3
Seq ID No: 47 & 48    443648  AI085377     Hs.143610    ESTs
Seq ID No: 49         311034  BE567130     Hs.311389    ESTs, Highly similar to NKGD_HUMAN NKG2-
Seq ID No: 50 & 51    408522  AI541214     Hs.46320     Small proline-rich protein SPRK [human,
Seq ID No: 52 & 53    422158  L10343       Hs.112341    protease inhibitor 3, skin-derived (SKAL
Seq ID No: 54 & 55    435505  AF200492     Hs.211238    interleukin-1 homolog 1
Seq ID No: 56 & 57    417366  BE185289     Hs.1076      small proline-rich protein 1B (cornifin)
Seq ID No: 58 & 59    431958  X63629       Hs.2877      cadherin 3, type 1, P-cadherin (placenta
Seq ID No: 60 & 61    441020  W79283       Hs.35962     ESTs
Seq ID No: 62 & 63    423217  NM_000094    Hs.1640      collagen, type VII, alpha 1 (epidermolys
Seq ID No: 64 & 65    429538  BE182592     Hs.11261     small proline-rich protein 2A
Seq ID No: 66 & 67    448733  NM_005629    Hs.187958    solute carrier family 6 (neuro
Seq ID No: 68 & 69    444371  BE540274     Hs.239       forkhead box M1
Seq ID No: 70 & 71    444371  BE540274     Hs.239       forkhead box M1
Seq ID No: 72 & 73    444371  BE540274     Hs.239       forkhead box M1
Seq ID No: 74 & 75    422168  AA586894     Hs.112408    S100 calcium-binding protein A7 (psorias
Seq ID No: 76 & 77    422168  AA586894     Hs.112408    S100 calcium-binding protein A7 (psorias
Seq ID No: 78 & 79    429259  AA420450     Hs.292911    Plakophilin
Seq ID No: 80 & 81    426440  BE382756     Hs.169902    solute carrier family 2 (facilitated glu
Seq ID No: 82 & 83    437044  AL035864     Hs.69517     differentially expressed in Fanconi's an
Seq ID No: 84 & 85    423662  AK001035     Hs.130881    B-cell CLL/lymphoma 11A (zinc finger pro
Seq ID No: 86 & 87    428484  AF104032     Hs.184601    solute carrier family 7 (cationic amino
Seq ID No: 88 & 89    429211  AF052693     Hs.198249    gap junction protein, beta 5 (connexin 3
Seq ID No: 90 & 91    417389  BE260964     Hs.82045     midkine (neurite growth-promoting factor
Seq ID No: 92 & 93    423634  AW959908     Hs.1690      heparin-binding growth factor binding pr
Seq ID No: 94 & 95    417515  L24203       Hs.82237     ataxia-telangiectasia group D-associated
Seq ID No: 96 & 97    441362  BE614410     Hs.23044     RAD51 (S. cerevisiae) homolog (E coli Re
Seq ID No: 98 & 99    425322  U63630       Hs.155637    protein kinase, DNA-activated, catalytic
Seq ID No: 100 & 101  449003  X76342       Hs.389       alcohol dehydrogenase 7 (class IV), mu o
Seq ID No: 102 & 103  431009  BE149762     Hs.48956     gap junction protein, beta 6 (connexin 3
Seq ID No: 104 & 105  409103  AF251237     Hs.112208    XAGE-1 protein
Seq ID No: 106 & 107  417542  J04129       Hs.82269     progestagen-associated endometrial prote
Seq ID No: 108 & 109  428471  X57348       Hs.184510    stratifin
Seq ID No: 110 & 111  418004  U37519       Hs.87539     aldehyde dehydrogenase 3 family, member
Seq ID No: 112 & 113  414761  AU077228     Hs.77256     enhancer of zeste (Drosophila) homolog 2
Seq ID No: 114 & 115  418203  X54942       Hs.83758     CDC28 protein kinase 2
Seq ID No: 116        447343  AA256841     Hs.236894    ESTs, Highly similar to S02392 alpha-2-m
Seq ID No: 117 & 118  437016  AU076916     Hs.5398      guanine monphosphate synthetase
Seq ID No: 119 & 120  449230  BE613348     Hs.211579    melanoma cell adhesion molecule
Seq ID No: 121 & 122          AK001898     Hs.16740     hypothetical protein FLJ11036
Seq ID No: 123 & 124  457819  AA057484     Hs.35406     ESTs, Highly similar to unnamed protein
Seq ID No: 125 & 126  424687  J05070       Hs.151738    matrix metalloproteinase 9 (gelatinase B

Seq ID No: 127 & 128 414430

Seq ID No: 129 & 130 416462

Seq ID No: 131 & 132 100668

Seq ID No: 133 & 134 458933

Seq ID No: 135 & 136 418478

Seq ID No: 137 & 138 418478

Seq ID No: 139 & 140 418478

Seq ID No: 141 & 142 418478

Seq ID No: 143 & 144 446269

Seq ID No: 145 & 146 422765 

Seq ID No: 147 & 148 436481 

Seq ID No: 149 & 150 440325

Seq ID No: 151 & 152 439606 

Seq ID No: 153 & 154 453884 

Seq ID No: 155 & 156 453884 

Seq ID No: 157 & 158 453884 

Seq ID No: 159 & 160 453884 

Seq ID No: 161 & 162 404877 

Seq ID No: 163 & 164 413129 

Seq ID No: 165 & 166 413281

Seq ID No: 167 & 168 444781 

Seq ID No: 169 & 170 416819 

Seq ID No: 171 & 172 451320 

Seq ID No: 173 & 174 418543 

Seq ID No: 175 & 176 454034 

Seq ID No: 177 & 178 425397

Seq ID No: 179 & 180 415817 

Seq ID No: 181 & 182 415817 

Seq ID No: 183 & 184 415817 

Seq ID No: 185 & 186 415817

Seq ID No: 187 & 188 415817 

Seq ID No: 189 & 190 419121

Seq ID No: 191 & 192 448993 

Seq ID No: 193 & 194 421817 

Seq ID No: 195 & 196 430393 

Seq ID No: 197 & 198 425057 

Seq ID No: 199 & 200 420462 

Seq ID No: 201 & 202 102963 

Seq ID No: 203 & 204 100576 

Seq ID No: 205 & 206 101175 

Seq ID No: 207 & 208 429038 

Seq ID No: 209 & 210 418678 

Seq ID No: 211 & 212 418678

Seq ID No: 213 & 214 131927

Seq ID No: 215 & 216 428182 

Seq ID No: 217 & 218 427335 

Seq ID No: 219 & 220 409420 

Seq ID No: 221 & 222 114346 

Seq ID No: 223 & 224 438956 

Seq ID No: 225 & 226 404440 

Seq ID No: 227 & 228 415669 

Seq ID No: 229 & 230 103312 

Seq ID No: 231 & 232 320843 

Seq ID No: 233 429065 

Seq ID No: 234 & 235 446102 

Seq ID No: 236 & 237 330495 

Seq ID No: 238 413573 

Seq ID No: 239 & 240 428479 

Seq ID No: 241 & 242 428479 

Seq ID No: 243 & 244 332180 

Seq ID No: 245 437915 

Seq ID No: 246 & 247 441553 

Seq ID No: 248 & 249 331692 

Seq ID No: 250 & 251 429413 

Seq ID No: 252 & 253 422283 

Seq ID No: 254 & 255 448357 

Seq ID No: 256 & 257 446292 

Seq ID No: 258 & 259 416209 

Seq ID No: 260 & 261 453922 

Seq ID No: 262 & 263 424046 

Seq ID No: 264 & 265 439223 

Seq ID No: 266 & 267 429228 

Seq ID No: 268 & 269 409757 

Seq ID No: 270 & 271 411089 

Seq ID No: 272 & 273 436511 

Seq ID No: 274 & 275 428969 

Seq ID No: 276 & 277 428969 

Seq ID No: 278 & 279 428969 

Seq ID No: 280 & 281 428969

Seq ID No: 282 407137 

Seq ID No: 283 & 284 412723

Seq ID No: 285 & 286 450701

Seq ID No: 287 & 288 405770

Seq ID No: 289 & 290 439453

Seq ID No: 291 & 292 414774



AI346201 


Hs.76118 


BE001596 


Hs.85266 


L05424 


Hs.169610 


AI638429 


Hs^4763 


U38945 


Hs.1174 


U38945 


Hs.1174 


U38945 


Hs.1174 


U38945 


Hs.1174 


AW263155 


Hs.14559 


AW409701 


Hs.1678 


AA379597 


Hs.5199 


NM_003812


Hs.7164 


W79123 


Hs.58561 


AA355925 


Hs.36232 


AA355925 


Hs.36232 


AA355925 


Hs.36232 


AA355925 


Hs.36232 


AF292100 


Hs.104613 


AA851271 


Hs.222024 


NM_014400


Hs.11950 


U77735 


Hs.80205 


AW118072




NM_005329


Hs.85962 


NM_000691


Hs.575 


J04088 


Hs.156346


U88967 


Hs.78867 


U88967 


Hs.78867 


U88967 


Hs.78867 


U88967 


Hs.78867 


U88967 


Hs.78867 


AA374372 


Hs.89626 


AI471630 


Hs.8127 


AF146074 


Hs.106660 


BE185030


Hs.241305 


AA826434 


Hs.1619 


AF050147 


Hs.97932 


X02404 


Hs.274534 


X00356 


Hs.37058 


U32671 


Hs.36980 


AL023513 


Hs.194766 


NM_001327


Hs.167379


NM_001327


Hs.167379 


AJ003112 


Hs.34780 


BE386042 


Hs.293317 


AA448542 


Hs.251677 


Z15008 


Hs.54451 


AL137256 


Hs.130489


W00847 


Hs.135056 


NM_005025


Hs.78589 


Y12642 


Hs.3185 


BE069288 


Hs.34744 


AI753247 


Hs.29643 


AW168067 


Hs.317694 


U47924 


Hs.71642 


AI733859


Hs.149089 


Y00272 


Hs.334562 


Y00272 


Hs.334562 


AF134160 


Hs.7327 


AI637993 


Hs.202312 


AA281219 


Hs.121296 


AI683487


Hs.152213 


NM_014058


Hs.201877 


AW411307 


Hs.114311


N20169 


Hs.108923 


AF081497 


Hs.279582 


AA236776 


Hs.79078 


AF053306 


Hs.36708 


AF027866 


Hs.138202


AW236299 


Hs.250618 


AI553633


Hs.326447 


NM_001898


Hs.123114 


AA456454 


Hs.214291 


AA721252 


Hs^91502 


AF120274 


Hs.194689 


AF120274 


Hs.194689


AF120274 


Hs.194689 


AF120274


Hs.194689 


T97307 




AA648459 


Hs.335951


H39960 


Hs.288467


BE264974 


Hs.6566 


X02419 


Hs.77274 



ubiquitin carboxyl-terminal esterase L1
integrin, beta 4

CD44 antigen (homing function and Indian
RAN binding protein 1
cyclin-dependent kinase inhibitor 2A (me
cyclin-dependent kinase inhibitor 2A (me
cyclin-dependent kinase inhibitor 2A (me
cyclin-dependent kinase inhibitor 2A (me
hypothetical protein FLJ10540
baculoviral IAP repeat-containing 5 (sur
HSPC150 protein similar to ubiquitin-con
a disintegrin and metalloproteinase doma
G protein-coupled receptor 87
KIAA0186 gene product
KIAA0186 gene product
KIAA0186 gene product
KIAA0186 gene product
NM_005365:Homo sapiens melanoma antigen,
RP42 homolog
transcription factor BMAL2
GPI-anchored metastasis-associated prote
pim-2 oncogene

diacylglycerol kinase, zeta (104kD) 
hyaluronan synthase 3 
aldehyde dehydrogenase 3 family, member 
topoisomerase (DNA) II alpha (170kD)
protein tyrosine phosphatase, receptor-t 
protein tyrosine phosphatase, receptor-t 
protein tyrosine phosphatase, receptor-t 
protein tyrosine phosphatase, receptor-t 
protein tyrosine phosphatase, receptor-t 
parathyroid hormone-like hormone 
KIAA0144 gene product
ATP-binding cassette, sub-family C (CFTR
estrogen-responsive B box protein
achaete-scute complex (Drosophila) homol
chondromodulin I precursor
calcitonin-related polypeptide, beta
calcitonin/calcitonin-related polypeptid
melanoma antigen, family A, 2
seizure related gene 6 (mouse)-like
cancer/testis antigen (NY-ESO-1)
cancer/testis antigen (NY-ESO-1)
doublecortex; lissencephaly, X-linked (d
ESTs, Weakly similar to GGC1_HUMAN G ANT
G antigen 76 

laminin, gamma 2 (nicein (100kD), kalini 
ATPase, aminophospholipid transporter-li
Human DNA sequence from clone RP5-850E9
NM_021048:Homo sapiens melanoma antigen,
serine (or cysteine) proteinase inhibito
lysosomal

Homo sapiens mRNA; cDNA DKFZp547C136 (fr
Homo sapiens cDNA FLJ13103 fis, clone NT
ESTs 

guanine nucleotide binding protein (G pr 
ESTs 

cell division cycle 2, G1 to S and G2 to 
cell division cycle 2, G1 to S and G2 to 
claudin 1 

Homo sapiens clone Nil NTera2D1 teratoca 
ESTs 

wingless-type MMTV integration site fami
DESC1 protein 

CDC45 (cell division cycle 45, S.cerevis
RAB38, member RAS oncogene family 
Rh type C glycoprotein 
MAD2 (mitotic arrest deficient, yeast, h 
budding uninhibited by benzimidazoles 1 
serine (or cysteine) proteinase inhibito 
UL16 binding protein 2
ESTs

cystatin SN

cell division cycle 2-like 1 (PITSLRE pr

ESTs 

artemin 

artemin 

artemin 

artemin 

gb:ye53h05.s1 Soares fetal liver spleen
hypothetical protein AF301222
hypothetical protein XP_098151 (leucine-
NM_002362:Homo sapiens melanoma antigen,
thyroid hormone receptor interactor 13
plasminogen activator, urokinase 
453968
403478 
441525 
434105 



Seq ID No: 293 & 294 424629
Seq ID No: 295 & 296 437789
Seq ID No: 297 & 298 437789
Seq ID No: 299 & 300 437789
Seq ID No: 301 & 302 437789
Seq ID No: 303 & 304 437789
Seq ID No: 305 & 306
Seq ID No: 307 & 308
Seq ID No: 309

Seq ID No: 310 & 311

Seq ID No: 312 & 313 428810
Seq ID No: 314 & 315 413691
Seq ID No: 316 & 317 423934
Seq ID No: 318 & 319 409226

Seq ID No: 320 & 321 425734
Seq ID No: 322 & 323 413582
Seq ID No: 324 & 325 438403
Seq ID No: 326 & 327 403329
Seq ID No: 328 & 329

Seq ID No: 330 & 331
Seq ID No: 332 & 333
Seq ID No: 334 & 335
Seq ID No: 336 & 337
Seq ID No: 338 & 339

Seq ID No: 340 & 341
Seq ID No: 342 & 343
Seq ID No: 344 & 345 134299
Seq ID No: 346 & 347 412719
Seq ID No: 348 & 349 422158

Seq ID No: 350 & 351 128924
Seq ID No: 352 & 353 100486
Seq ID No: 354 & 355 419121
Seq ID No: 356 & 357 409459
Seq ID No: 358 & 359 330493

Seq ID No: 360 & 361
Seq ID No: 362 & 363
Seq ID No: 364 & 365
Seq ID No: 366 & 367
Seq ID No: 368 & 369 440704

Seq ID No: 370 & 371 431221
Seq ID No: 372 & 373 431565
Seq ID No: 374 & 375 431565
Seq ID No: 376 & 377 132354
Seq ID No: 378 & 379 424441

Seq ID No: 380 & 381 103768
Seq ID No: 382 & 383 417512
Seq ID No: 384 & 385
Seq ID No: 386 & 387
Seq ID No: 388 & 389

Seq ID No: 390 & 391

Seq ID No: 392 & 393 418007
Seq ID No: 394 & 395 418738
Seq ID No: 396 & 397
Seq ID No: 398 & 399

Seq ID No: 400 & 401
Seq ID No: 402 & 403
Seq ID No: 404 & 405
Seq ID No: 406 & 407
Seq ID No: 408 & 409 422867

Seq ID No: 410 & 411 428227
Seq ID No: 412 & 413 444381
Seq ID No: 414 & 415 400303
Seq ID No: 416 & 417 411789
Seq ID No: 418 & 419 428698

Seq ID No: 420 & 421
Seq ID No: 422 & 423
Seq ID No: 424 & 425
Seq ID No: 426 & 427
Seq ID No: 428 & 429

Seq ID No: 430 & 431
Seq ID No: 432 & 433
Seq ID No: 434 & 435 427585
Seq ID No: 436 & 437 442117
Seq ID No: 438 & 439 431211

Seq ID No: 440 & 441
Seq ID No: 442 & 443
Seq ID No: 444 & 445
Seq ID No: 446 & 447
Seq ID No: 448 & 449

Seq ID No: 450 & 451
Seq ID No: 452 & 453
Seq ID No: 454 & 455
Seq ID No: 456 & 457 412420
Seq ID No: 458 & 459 416658

Seq ID No: 460 & 461 407811
119073
113195 
102283 
101345 
103280 
102012 
105729 



417866 
418113 
437016 
429612 



425266 
424503 
400289 
418007 



415138 
418506 
423961 
414812 
417433 
417433 



450098 
421552 
452747 
450375 
426215 
425247 
432201 



447033 
447033 
447033 
115522 
410418 
409041 
409041 
452461 



M90656 

AI581344

AI581344 

AI581344 

AI581344 

AI581344 

AAB47843 

AW241867 

AW952124 

AF068236 

AB023173 

U89995 

R16811 

AF056209 

AW295647 

AA806607 

AW247090 

BE245360 

H83265 

AW161552 

NM_005795

U84722 

BE259035 

H46612 

AW580939 

AW016610 

L10343 

BE279383 

T19006 

AA374372 

D85407 

W27826 

AW067903 

AI272141 

AU076916 

AF062649 

M69241 

AA449015 

AF161470 

AF161470 

BE185289 

X14850 

AF086009 

X76534 

J00077 

NM_002205

X07820 

M13509 

M13509 

AW388633 

C18356 

AA084248 

D13666 

X72755 

BE270266 

BE270266 

L32137 

AA321649 

BE387335 

AA242758 

AF245505 

AA852773 

W27249 

AF026692 

BE153855 

AA009647 

AW963419 

NM_005940

AI538613

D31152 

AW664964 

M86849 

AI357412

AI357412 

AI357412 

BE614387 

D31382 

AB033025 

AB033025 

N78223 

AL035668 

U03272 

AW190902 



Hs.151393 
Hs.127812 
Hs.127812 
Hs.127812 
Hs.127812 
Hs.127812 
Hs.62711 

Hs.127728

Hs.13094 

Hs.193788 

Hs.75478 

Hs.159234 

Hs.22010 

Hs.159396 

Hs.71331 

Hs.292206 

Hs.57101 

Hs.279477 

Hs.8881 

Hs.83381

Hs.152175 

Hs.76206 

Hs.118400 

Hs.293815 

Hs.97199 

Hs.816 

Hs.112341 

Hs.26557 

Hs.10842 

Hs.89626 

Hs.54481 

Hs.82772 

Hs.83484 

Hs.5398 

Hs.252587 

Hs.162 

Hs.286145 

Hs.260622 

Hs.260622 

Hs.1076 

Hs.147097 

Hs.296398 

Hs.82226 

Hs.155421 

Hs.149609

Hs.2258 

Hs.83169 

Hs.83169 

Hs.6682

Hs.295944 

Hs.85339 

Hs.136348 

Hs.77367 

Hs.82128 

Hs.82128 

Hs.1584 

Hs.2248 

Hs.283713 

Hs.79136 

Hs.72157 

Hs.334838 

Hs.8109 

Hs.105700 

Hs.61460 

Hs.155223 
Hs.155324 
Hs.298241 
Hs.179729 



glutamate-cysteine ligase, catalytic sub
ESTs, Weakly similar to T17330 hypotheti
ESTs, Weakly similar to T17330 hypotheti
ESTs, Weakly similar to T17330 hypotheti
ESTs, Weakly similar to T17330 hypotheti
ESTs, Weakly similar to T17330 hypotheti
High mobility group (nonhistone chromoso
NM_022342:Homo sapiens kinesin protein 9
ESTs

presenilins associated rhomboid-like pro
nitric oxide synthase 2A (inducible, hep
ATPase, Class VI, type 11B
forkhead box E1 (thyroid transcription f
ESTs, Weakly similar to 210926QA B cell
peptidylglycine alpha-amidating monooxyg
hypothetical protein MGC5350 
ESTs 

unnamed protein product [Homo sapiens] 
minichromosome maintenance deficient (S.
v-ets erythroblastosis virus E26 oncogen 
ESTs, Weakly similar to S41044 chromosom 
guanine nucleotide binding protein 11 
calcitonin receptor-like 
cadherin 5, type 2, VE-cadherin (vascula 
singed (Drosophila)-like (sea urchin fas
Homo sapiens HSPC285 mRNA, partial cds 
complement component C1q receptor 
ESTs 

protease inhibitor 3, skin-derived (SKAL 
plakophilin 3

RAN, member RAS oncogene family 
parathyroid hormone-like hormone 
low density lipoprotein receptor-related 
endogenous retroviral protease 
collagen, type XI, alpha 1
SRY (sex determining region Y)-box 4 
guanine monphosphate synthetase 
pituitary tumor-transforming 1 
insulin-like growth factor binding prote
SRB7 (suppressor of RNA polymerase B, ye
butyrate-induced transcript 1
butyrate-induced transcript 1 
small proline-rich protein 1B (cornifin) 
H2A histone family, member X 
gb:Homo sapiens full length insert cDNA 
glycoprotein (transmembrane) nmb 



Hs.323733 

Hs.157601 

Hs.157601

Hs.157601 

Hs.333893 

Hs.63325 

Hs.50081 

Hs.50081 

Hs.108106 

Hs.73853 

Hs.79432 



Integrin, alpha 5 (fibronectin receptor, 
matrix metalloproteinase 10 (stromelysin 
matrix metalloproteinase 1 (interstitial 
matrix metalloproteinase 1 (interstitial 
solute carrier family 7, (cationic amino 
tissue factor pathway inhibitor 2 
G protein-coupled receptor 39 
periostin (OSF-2os) 
monokine induced by gamma interferon 
5T4 oncofetal trophoblast glycoprotein
5T4 oncofetal trophoblast glycoprotein
cartilage oligomeric matrix protein (pse
small inducible cytokine subfamily B (Cy
ESTs, Weakly similar to S64054 hypotheti
LIV-1 protein, estrogen regulated
Adlican

KIAA1866 protein

hypothetical protein FLJ21080

secreted frizzled-related protein 4

Ig superfamily receptor LNIR

a disintegrin and metalloproteinase doma

stanniocalcin 2

matrix metalloproteinase 11 (stromelysin
Transmembrane protease, serine 3
collagen, type X, alpha 1 (Schmid metaph
ESTs; hypothetical protein for IMAGE:447
gap junction protein, beta 2, 26kD (conn
ESTs 
ESTs 
ESTs 

c-Myc target JP01 
transmembrane protease, serine 4 
Hypothetical protein, XP_051860 (KIAA119
Hypothetical protein, XP_051850 (KIAA119
transcription factor
bone morphogenetic protein 2
fibrillin 2 (congenital contractural ara
cysteine knot superfamily 1, BMP antagon 
Seq ID No: 462 & 463 437852

Seq ID No: 464 & 465 402075 

Seq ID No: 466 & 467 421110

Seq ID No: 468 & 469 451668 

Seq ID No: 470 & 471 451668 

Seq ID No: 472 & 473 451668 

Seq ID No: 474 & 475 422282 

Seq ID No: 476 & 477 425852 

Seq ID No: 478 & 479 439738 

Seq ID No: 480 & 481 427747 

Seq ID No: 482 & 483 420281 

Seq ID No: 484 & 485 405932 

Seq ID No: 486 & 487 405932 

Seq ID No: 488 & 489 444342 

Seq ID No: 490 & 491 421379

Seq ID No: 492 & 493 417079 

Seq ID No: 494 & 495 430890 

Seq ID No: 496 & 497 419721

Seq ID No: 498 & 499 444471 

Seq ID No: 500 & 501 413063 

Seq ID No: 502 & 503 433800 

Seq ID No: 504 & 505 452401

Seq ID No: 506 & 507 452401 

Seq ID No: 508 & 509 450001 

Seq ID No: 510 & 511 410407 

Seq ID No: 512 & 513 309931 

Seq ID No: 514 & 515 412719 

Seq ID No: 516 & 517 417034 

Seq ID No: 518 & 519 430486 

Seq ID No: 520 & 521 413753 

Seq ID No: 522 & 523 425650 

Seq ID No: 524 & 525 423673 

Seq ID No: 526 & 527 418663 

Seq ID No: 528 & 529 418663 

Seq ID No: 530 & 531 429610 

Seq ID No: 532 & 533 406690 

Seq ID No: 534 & 535 431846 

Seq ID No: 536 & 537 422158 

Seq ID No: 538 & 539 431958 

Seq ID No: 540 & 541 437044 

Seq ID No: 542 & 543 428484 

Seq ID No: 544 & 545 429211 

Seq ID No: 546 & 547 417389 

Seq ID No: 548 & 549 431009 

Seq ID No: 550 & 551 417542 

Seq ID No: 552 & 553 449230 

Seq ID No: 554 & 555 410555 

Seq ID No: 556 & 557 410555 

Seq ID No: 558 & 559 424687 

Seq ID No: 560 & 561 418462 

Seq ID No: 562 & 563 410274 

Seq ID No: 564 & 565 439606 

Seq ID No: 566 & 567 404877 

Seq ID No: 568 & 569 444781 

Seq ID No: 570 & 571 418543 

Seq ID No: 572 & 573 415817 

Seq ID No: 574 & 575 415817 

Seq ID No: 576 & 577 415817 

Seq ID No: 578 & 579 415817 

Seq ID No: 580 & 581 415817 

Seq ID No: 582 & 583 415817 

Seq ID No: 584 & 585 421817 

Seq ID No: 586 & 587 418678 

Seq ID No: 588 & 589 418678 

Seq ID No: 590 & 591 409420 

Seq ID No: 592 & 593 332180 

Seq ID No: 594 & 595 408790 

Seq ID No: 596 & 597 408790 

Seq ID No: 598 & 599 439223 

Seq ID No: 600 & 601 409757 

Seq ID No: 602 & 603 428969 

Seq ID No: 604 & 605 428969 

Seq ID No: 606 & 607 428969 

Seq ID No: 608 & 609 428969 

Seq ID No: 610 & 611 450701 

Seq ID No: 612 & 613 450701 

Seq ID No: 614 & 615 414774 

Seq ID No: 616 & 617 407944 

Seq ID No: 618 & 619 407944 

Seq ID No: 620 & 621 457489 

Seq ID No: 622 & 623 429547

Seq ID No: 624 & 625 407242 

Seq ID No: 626 & 627 407242 

Seq ID No: 628 & 629 407242 

Seq ID No: 630 & 631 444006 



BE001838 


Hs.256897


AJ250717 


Hs.1355 


Z43948


Hs.326444 


Z43948 


Hs.326444 


Z43948 


Hs.326444 


AF019225 


Hs.114309


AK001504


Hs.159651 


BE246502 


Hs.9598 


AW411425 


Hs.160655


AI623693 


Hs.323494 


NM_014398


Hs.10887


Y15221 


Hs.103982 


U65590 


Hs.81134 


X54232 


Hs.2699 


NM_001650


Hs.288650 


AB020684 


Hs.11217 


AL035737 


Hs.75184 


AI034361 


Hs.135150 


NM_007115


Hs.29352


NM_007115


Hs.29352 


NM_001044


Hs.406 


X66839 


Hs.63287 


AW341683 




AW016610 


Hs.816 


NM_006183


Hs.80962


BE062109 


Hs.241551 


U17760 


Hs.75517 


NM_001944


Hs.1925 


BE003054 


Hs.1695 


AK001100 


Hs.41690 


AK001100 


Hs.41690 


AB024937 


Hs.211092 


M29540 


Hs.220529 


BE019924 


Hs.271580 


L10343 


Hs.112341


X63629 


Hs.2877


AL035864 


Hs.69517 


AF104032 


Hs.184601 


AF052693 


Hs.198249 


BE260964 


Hs.82045 


BE149762 


Hs.48956 


J04129 


Hs.82269 


BE613348 


Hs.211579 


U92649 


Hs.64311 


U92649 


Hs.64311 


tnntvtn 
JUOU/U 


ns. 10 1 1 oo 


BE001596


Hs.85266 


AA381807 


Hs.61762 


W79123 


Hs.58561 


NM_014400


Hs.11950


NM_005329


Hs.85962 


U88967


Hs.78867


U88967 


Hs.78867 


U88967 


Hs.78867


U88967 


Hs.78867


U88967 


Hs.78867 


U88967 


Hs.78867 


AF146074


Hs.108660 


NM_001327


Hs.167379 


NM_001327


Hs.167379 


Z15008 


Hs.54451 


AF134160 


Hs.7327 


AW580227 


Hs.47860 


AW580227 


Hs.47860 


AW238299 


Hs.250618 


NM_001898 


Hs.123114 


AF120274 


Hs.194689 


AF120274 


Hs.194689 


AF120274 


Hs.194689 


AF120274 


Hs.194689 


H39960 


Hs.288467


H39960 


Hs.288467 


X02419 


Hs.77274 


R34008 


Hs.239727 


R34008 


Hs.239727 


AI693815 


Hs.127179 


AW009168 


Hs.99376 


M18728 




M18728




M18728 




BE395085 


Hs.10086 



ESTs, Weakly similar to d)36501 2.1 [H.sa
ENSP00000251056*:Plasma membrane calcium
cathepsin E

cartilage acidic protein 1
cartilage acidic protein 1
cartilage acidic protein 1
apolipoprotein I

death receptor 6, TNF superfamily member
sema domain, immunoglobulin domain (Ig), 
serine/threonine kinase 12 
Predicted cation efflux pump 
C15000305:gi|3806122(gb|AAC6919ai| (AF0 
C15000305:gp06122lgb|AAC6919ai| (AF0 
similar to lysosome-associated membrane
small inducible cytokine subfamily B (Cy
interleukin 1 receptor antagonist
glypican 1
aquaporin 4
KIAA0877 protein

chitinase 3-like 1 (cartilage glycoprote
lung type-I cell membrane-associated gly
tumor necrosis factor, alpha-induced pro
tumor necrosis factor, alpha-induced pro
solute carrier family 6 (neurotransmitte
carbonic anhydrase IX

gb:hd13d01.x1 Soares_NFL_T_GBC_S1 Homo s
ESTs 

neurotensin 

chloride channel, calcium activated, fam 
laminin, beta 3 (nicein (125kD), kalinin 
desmoglein 3 (pemphigus vulgaris antigen 
matrix metalloproteinase 12 (macrophage
desmocollin 3
desmocollin 3

LUNX protein; PLUNC (palate lung and nas
carcinoembryonic antigen-related cell ad
uroplakin 1B

protease inhibitor 3, skin-derived (SKAL 
cadherin 3, type 1, P-cadherin (placenta 
differentially expressed in Fanconi's an
solute carrier family 7 (cationic amino
gap junction protein, beta 5 (connexin 3
midkine (neurite growth-promoting factor
gap junction protein, beta 6 (connexin 3
progestagen-associated endometrial prote
melanoma cell adhesion molecule
a disintegrin and metalloproteinase doma
a disintegrin and metalloproteinase doma
matrix metalloproteinase 9 (gelatinase B
integrin, beta 4
hypoxia-inducible protein 2
G protein-coupled receptor 87
NM_005365:Homo sapiens melanoma antigen,
GPI-anchored metastasis-associated prote 
hyaluronan synthase 3 
protein tyrosine phosphatase, receptor-t 
protein tyrosine phosphatase, receptor-t 
protein tyrosine phosphatase, receptor-t 
protein tyrosine phosphatase, receptor-t 
protein tyrosine phosphatase, receptor-t 
protein tyrosine phosphatase, receptor-t 
ATP-binding cassette, sub-family C (CFTR 
cancer/testis antigen (NY-ESO-1) 
cancer/testis antigen (NY-ESO-1) 
laminin, gamma 2 (nicein (100kD), kalini
claudin 1

neurotrophic tyrosine kinase, receptor, 

neurotrophic tyrosine kinase, receptor, 

UL16 binding protein 2

cystatin SN

artemin

artemin

artemin

artemin

hypothetical protein XP_098151 (leucine-
hypothetical protein XP_098151 (leucine-
plasminogen activator, urokinase 
desmocollin 2 
desmocollin 2 
cryptic gene 
ESTs 

gb:Human nonspecific crossreacting antig
gb:Human nonspecific crossreacting antig
gb:Human nonspecific crossreacting antig
type I transmembrane protein Fn14 
Seq ID No: 632 & 633
Seq ID No: 634 & 635
Seq ID No: 636 & 637
Seq ID No: 638 & 639
Seq ID No: 640 & 641
Seq ID No: 642 & 643
Seq ID No: 644 & 645
Seq ID No: 646 & 647
Seq ID No: 648 & 649
Seq ID No: 650 & 651
Seq ID No: 652 & 653
Seq ID No: 654 & 655
Seq ID No: 656 & 657
Seq ID No: 658 & 659
Seq ID No: 660 & 661
Seq ID No: 662 & 663
Seq ID No: 664 & 665
Seq ID No: 666 & 667
Seq ID No: 668 & 669
Seq ID No: 670 & 671
Seq ID No: 672 & 673
Seq ID No: 674 & 675
Seq ID No: 676 & 677
Seq ID No: 678 & 679
Seq ID No: 680 & 681
Seq ID No: 682 & 683
Seq ID No: 684 & 685
Seq ID No: 686 & 687
Seq ID No: 688 & 689

TABLE 15B 






429597  NM_003816  Hs.2442    a disintegrin and metalloproteinase doma
422109  S73265     Hs.1473    gastrin-releasing peptide
419235  AW470411   Hs^88433   neurotrimin
449048  Z45051     Hs.22920   similar to S68401 (cattle) glucose induc
419216  AU076718   Hs.164021  small inducible cytokine subfamily B (Cy
431462  AW583672   Hs.256311  granin-like neuroendocrine peptide precu
448243  AW3S9771   Hs.52620   integrin, beta 8
426427  M86699     Hs.169840  TTK protein kinase
445537  AJ245671   Hs.12844   EGF-like-domain, multiple 6
422278  AF072873   Hs.114218  frizzled (Drosophila) homolog 6
428450  NM_014791  Hs.184339  KIAA0175 gene product
446619  AU076643   Hs.313     secreted phosphoprotein 1 (osteopontin,
453392  U23752     Hs.32954   SRY (sex determining region Y)-box 11
426514  BE616633   Hs.170195  bone morphogenetic protein 7 (osteogenic
425776  U25128     Hs.159499  parathyroid hormone receptor 2
425776  U25128     Hs.159499  parathyroid hormone receptor 2
431515  NM_012152  Hs.258583  endothelial differentiation, lysophospha
419452  U33635     Hs.90572   PTK7 protein tyrosine kinase 7
432653  N62096     Hs.293185  ESTs, Weakly similar to JC7328 amino aci
432653  N62096     Hs.293185  ESTs, Weakly similar to JC7328 amino aci
432653  N62096     Hs.293185  ESTs, Weakly similar to JC7328 amino aci
432653  N62096     Hs.293185  ESTs, Weakly similar to JC7328 amino aci
410001  AB041036   Hs.57771   kallikrein 11
425501  AW043782   Hs.293516  ESTs
408369  R38438     Hs.182575  solute carrier family 15 (H77? transport
445413  AA151342   Hs.12677   CGM47 protein
422424  AI186431   Hs.296638  prostate differentiation factor
428330  L22524     Hs.2256    matrix metalloproteinase 7 (matrilysin,
420610  AI683183   Hs.99348   distal-less homeobox 5



Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers



Pkey    CAT Number  Accession
309931  AW341683
330493  33264.5     M27826 R78416 AA307645 AW957879 AW957800 M633529 H03662
439285  47055.1     AL133916 N79113 AF086101 N76721 AW950828 AA364013 AW9556B4 AI346341 AI867454 N54784 AI655270 AI421279 AW014882
                    AA775552 N62351 N59253 AA626243 AI341407 BE175639 AA456968 AI358918 AA457077
450375  83327.1     AA009647 AA131254 M374293 AW954405 H04410 AW606284 AA151166 BE157467 BE157601 H04384 W46291 AW663674 H04021 H01532
                    AA190993 H03231 H59605 H01642 AA852876 AA113758 AA626915 AA746952 AI161014 AA099554 R69067
451320  86576.1     AW118072 AI631982 T15734 AA224195 AI701458 W20198 F26326 AA890570 N90552 AW071907 AI671352 AI375892 T03517 RB8265
                    AI124088 AA224388 AI084316 AI354686 T33652 AI140719 AI720211 T03490 AI372637 T15415 AW205836 AA630384 T03515 T33230
                    AA017131 AA443303 T33623 AI222556 T33511 T33785 AI419606 D55612
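
Table 15B relates each Pkey to a CAT (gene cluster) number and to the Genbank accessions grouped in that cluster. The following is a minimal Python sketch of that relationship, assuming the three columns align row by row as transcribed here. The names `Cluster`, `ROWS` and `index_by_accession` are illustrative only and do not appear in the specification, and the accession lists are abbreviated to the first three entries of each row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    pkey: int          # Eos probeset identifier (Pkey column)
    cat_number: str    # gene cluster number (CAT Number column)
    accessions: tuple  # Genbank accessions grouped in the cluster


# Two rows transcribed from Table 15B; accession lists are abbreviated (assumption:
# the columns align row by row as shown in the table).
ROWS = [
    Cluster(450375, "83327.1", ("AA009647", "AA131254", "AW954405")),
    Cluster(451320, "86576.1", ("AW118072", "AI631982", "T15734")),
]


def index_by_accession(rows):
    """Return a dict mapping each accession to the Pkey of its cluster."""
    index = {}
    for row in rows:
        for accession in row.accessions:
            index[accession] = row.pkey
    return index


ACCESSION_TO_PKEY = index_by_accession(ROWS)
```

With such an index, an accession that appears without a UniGene identifier in Table 15A can be traced back to its probeset through the cluster listed here.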



TABLE 15C 



Pkey: Unique number corresponding to an Eos probeset 

Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA
sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.

Pkey Ref Strand Nt_position

402075 8117407 Plus 121907-122035,122804-122921,124019-124161,124455-124610,125672-126076

403329 8516120 Plus 96450-96598

403478 9958258 Plus 116458-116564 

404440 7528051 Plus 80430-81581 

404877 1519284 Plus 1095-2107 

405770 2735037 Plus 61057-62075 

405932 7767812 Minus 123525-123713 
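
The Nt_position column of Table 15C lists predicted exons as comma-separated start-end ranges. The following is a minimal Python sketch of how such an entry might be parsed, assuming the coordinates are 1-based and inclusive (the table does not state this explicitly). The function names `parse_exons` and `total_exon_length` are hypothetical.

```python
def parse_exons(nt_position):
    """Parse a string such as '96450-96598,97000-97100' into (start, end) pairs."""
    exons = []
    for span in nt_position.split(","):
        start, end = span.split("-")
        exons.append((int(start), int(end)))
    return exons


def total_exon_length(exons):
    """Sum exon lengths, treating coordinates as 1-based and inclusive (assumption)."""
    return sum(end - start + 1 for start, end in exons)


# Row for Pkey 404877 (Ref 1519284, Plus strand) as listed in Table 15C.
exons = parse_exons("1095-2107")
print(exons, total_exon_length(exons))
```

Under the inclusive-coordinate assumption, the single exon predicted for Pkey 404877 spans 1013 nucleotides.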
Table 16

Seq ID NO: 1 DNA sequence

Nucleic Acid Accession #: NM_001216

Coding sequence: 43-1422

1 11 21 31 41 51

| | | | | |

GCCCGTACAC ACCGTGTGCT GGGACACCCC ACAGTCAGCC GCATGGCTCC CCTGTGCCCC 60 

AGCCCCTGGC TCCCTCTGTT GATCCCGGCC CCTGCTCCAG GCCTCACTGT GCAACTGCTG 120 

CTGTCACTGC TGCTTCTGAT GCCTGTCCAT CCCCAGAGGT TGCCCCGGAT GCAGGAGGAT 180 

TCCCCCTTGG GAGGAGGCTC TTCTGGGGAA GATGACCCAC TGGGCGAGGA GGATCTGCCC 240 

AGTGAAGAGG ATTCACCCAG AGAGGAGGAT CCACCCGGAG AGGAGGATCT ACCTGGAGAG 300 

GAGGATCTAC CTGGAGAGGA GGATCTACCT GAAGTTAAGC CTAAATCAGA AGAAGAGGGC 360 

TCCCTGAAGT TAGAGGATCT ACCTACTGTT GAGGCTCCTG GAGATCCTCA AGAACCCCAG 420 

AATAATGCCC ACAGGGACAA AGAAGGGGAT GACCAGAGTC ATTGGCGCTA TGGAGGCGAC 480 

CCGCCCTGGC CCCGGGTGTC CCCAGCCTGC GCGGGCCGCT TCCAGTCCCC GGTGGATATC 540 

CGCCCCCAGC TCGCCGCCTT CTGCCCGGCC CTGCGCCCCC TGGAACTCCT GGGCTTCCAG 600 

CTCCCGCCGC TCCCAGAACT GCGCCTGCGC AACAATGGCC ACAGTGTGCA ACTGACCCTG 660 

CCTCCTGGGC TAGAGATGGC TCTGGGTCCC GGGCGGGAGT ACCGGGCTCT GCAGCTGCAT 720 

CTGCACTGGG GGGCTGCAGG TCGTCCGGGC TCGGAGCACA CTGTGGAAGG CCACCGTTTC 780 

CCTGCCGAGA TCCACGTGGT TCACCTCAGC ACCGCCTTTG CCAGAGTTGA CGAGGCCTTG 840 

GGGCGCCCGG GAGGCCTGGC CGTGTTGGCC GCCTTTCTGG AGGAGGGCCC GGAAGAAAAC 900 

AGTGCCTATG AGCAGTTGCT GTCTCGCTTG GAAGAAATCG CTGAGGAAGG CTCAGAGACT 960 

CAGGTCCCAG GACTGGACAT ATCTGCACTC CTGCCCTCTG ACTTCAGCCG CTACTTCCAA 1020 

TATGAGGGGT CTCTGACTAC ACCGCCCTGT GCCCAGGGTG TCATCTGGAC TGTGTTTAAC 1080 

CAGACAGTGA TGCTGAGTGC TAAGCAGCTC CACACCCTCT CTGACACCCT GTGGGGACCT 1140 

GGTGACTCTC GGCTACAGCT GAACTTCCGA GCGACGCAGC CTTTGAATGG GCGAGTGATT 1200 

GAGGCCTCCT TCCCTGCTGG AGTGGACAGC AGTCCTCGGG CTGCTGAGCC AGTCCAGCTG 1260 

AATTCCTGCC TGGCTGCTGG TGACATCCTA GCCCTGGTTT TTGGCCTCCT TTTTGCTGTC 1320 

ACCAGCGTCG CGTTCCTTGT GCAGATGAGA AGGCAGCACA GAAGGGGAAC CAAAGGGGGT 1380 

GTGAGCTACC GCCCAGCAGA GGTAGCCGAG ACTGGAGCCT AGAGGCTGGA TCTTGGAGAA 1440 

TGTGAGAAGC CAGCCAGAGG CATCTGAGGG GGAGCCGGTA ACTGTCCTGT CCTGCTCATT 1500 
ATGCCACTTC CTTTTAACTG CCAAGAAATT TTTTAAAATA AATATTTATA AT 

Seq ID NO: 2 Protein sequence: 
Protein Accession #: NP_001207

1 11 21 31 41 51

| | | | | |

MAPLCPSPWL PLLIPAPAPG LTVQLLLSLL LLMPVHPQRL PRMQEDSPLG GGSSGEDDPL 60

GEEDLPSEED SPREEDPPGE EDLPGEEDLP GEEDLPEVKP KSEEEGSLKL EDLPTVEAPG 120

DPQEPQNNAH RDKEGDDQSH WRYGGDPPWP RVSPACAGRF QSPVDIRPQL AAFCPALRPL 180

ELLGFQLPPL PELRLRNNGH SVQLTLPPGL EMALGPGREY RALQLHLHWG AAGRPGSEHT 240

VEGHRFPAEI HVVHLSTAFA RVDEALGRPG GLAVLAAFLE EGPEENSAYE QLLSRLEEIA 300

EEGSETQVPG LDISALLPSD FSRYFQYEGS LTTPPCAQGV IWTVFNQTVM LSAKQLHTLS 360

DTLWGPGDSR LQLNFRATQP LNGRVIEASF PAGVDSSPRA AEPVQLNSCL AAGDILALVF 420 
GLLFAVTSVA FLVQMRRQHR RGTKGGVSYR PAEVAETGA 
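
Each DNA entry in Table 16 gives a "Coding sequence" range, and the paired protein entry is the translation of that range. The following is a minimal Python sketch of that relationship, assuming the range is 1-based and inclusive and that the standard genetic code applies. The helper names `read_listing` and `translate` are illustrative and not part of the specification. Only the first two rows of Seq ID NO: 1 are used here.

```python
BASES = "TCAG"
# Standard genetic code, ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def read_listing(lines):
    """Join listing rows into one sequence, dropping spaces and position counts."""
    blocks = []
    for line in lines:
        for token in line.split():
            if token.isalpha():  # skips the running count at the end of each row
                blocks.append(token.upper())
    return "".join(blocks)


def translate(dna, start, end=None):
    """Translate from 1-based position start through end (inclusive).

    Translation stops at the first stop codon; unknown codons become 'X'.
    """
    cds = dna[start - 1:end]
    protein = []
    for i in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE.get(cds[i:i + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First two rows of Seq ID NO: 1 as printed in Table 16.
listing = [
    "GCCCGTACAC ACCGTGTGCT GGGACACCCC ACAGTCAGCC GCATGGCTCC CCTGTGCCCC 60",
    "AGCCCCTGGC TCCCTCTGTT GATCCCGGCC CCTGCTCCAG GCCTCACTGT GCAACTGCTG 120",
]
dna = read_listing(listing)
print(translate(dna, 43))
```

Starting at position 43, the two rows translate to MAPLCPSPWLPLLIPAPAPGLTVQLL, which agrees with the first 26 residues of Seq ID NO: 2.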

Seq ID NO: 3 DNA sequence 

Nucleic Acid Accession #: BC013923 

Coding sequence: 438-1391 

1 11 21 31 41 51 

| | | | | |

AGCGGGGTTG TCTATTAACT TGTTCAAAAA GTATCAGGAG TTGTCAAGGC AGAGAAGAGA 60 

GTGTTTGCAA AAGGGGGAAA GTAGTTTGCT GCCTCTTTAA GACTAGGACT GAGAGAAAGA 120 

AGAGGAGAGA GAAAGAAAGG GAGAGAAGTT TGAGCCCCAG GCTTAAGCCT TTCCAAAAAA 180 

TAATAATAAC AATCATCGGC GGCGGCAGGA TCGGCCAGAG GAGGAGGGAA GCGCTTTTTT 240 

TGATCCTGAT TCCAGTTTGC CTCTCTCTTT TTTTCCCCCA AATTATTCTT CGCCTGATTT 300 

TCCTCGCGGA GCCCTGCGCT CCCGACACCC CCGCCCGCCT CCCCTCCTCC TCTCCCCCCG 360 

CCCGCGGGCC CCCCAAAGTC CCGGCCGGGC CGAGGGTCGG CGGCCGCCGG CGGGCCGGGC 420 

CCGCGCACAG CGCCCGCATG TACAACATGA TGGAGACGGA GCTGAAGCCG CCGGGCCCGC 480

AGCAAACTTC GGGGGGCGGC GGCGGCAACT CCACCGCGGC GGCGGCCGGC GGCAACCAGA 540 

AAAACAGCCC GGACCGCGTC AAGCGGCCCA TGAATGCCTT CATGGTGTGG TCCCGCGGGC 600 

AGCGGCGCAA GATGGCCCAG GAGAACCCCA AGATGCACAA CTCGGAGATC AGCAAGCGCC 660 

TGGGCGCCGA GTGGAAACTT TTGTCGGAGA CGGAGAAGCG GCCGTTCATC GACGAGGCTA 720 

AGCGGCTGCG AGCGCTGCAC ATGAAGGAGC ACCCGGATTA TAAATACCGG CCCCGGCGGA 780 

AAACCAAGAC GCTCATGAAG AAGGATAAGT ACACGCTGCC CGGCGGGCTG CTGGCCCCCG 840

GCGGCAATAG CATGGCGAGC GGGGTCGGGG TGGGCGCCGG CCTGGGCGCG GGCGTGAACC 900 

AGCGCATGGA CAGTTACGCG CACATGAACG GCTGGAGCAA CGGCAGCTAC AGCATGATGC 960 

AGGACCAGCT GGGCTACCCG CAGCACCCGG GCCTCAATGC GCACGGCGCA GCGCAGATGC 1020 

AGCCCATGCA CCGCTACGAC GTGAGCGCCC TGCAGTACAA CTCCATGACC AGCTCGCAGA 1080 

CCTACATGAA CGGCTCGCCC ACCTACAGCA TGTCCTACTC GCAGCAGGGC ACCCCTGGCA 1140 

TGGCTCTTGG CTCCATGGGT TCGGTGGTCA AGTCCGAGGC CAGCTCCAGC CCCCCTGTGG 1200 

TTACCTCTTC CTCCCACTCC AGGGCGCCCT GCCAGGCCGG GGACCTCCGG GACATGATCA 1260 

GCATGTATCT CCCCGGCGCC GAGGTGCCGG AACCCGCCGC CCCCAGCAGA CTTCACATGT 1320 

CCCAGCACTA CCAGAGCGGC CCGGTGCCCG GCACGGCCAT TAACGGCACA CTGCCCCTCT 1380 

CACACATGTG AGGGCCGGAC AGCGAACTGG AGGGGGGAGA AATTTTCAAA GAAAAACGAG 1440 

GGAAATGGGA GGGGTGCAAA AGAGGAGAGT AAGAAACAGC ATGGAGAAAA CCCGGTACGC 1500 

TCAAAAAAAA AAAAAAAAAA AAAATCCCAT CACCCACAGC AAATGACAGC TGCAAAAGAG 1560 

AACACCAATC CCATCCACAC TCACGCAAAA ACCGCGATGC CGACAAGAAA ACTTTTATGA 1620 

GAGAGATCCT GGACTTCTTT TKGGGGGACT ATTTTTGTAC AGAGAAAACC TGGGGAGGGT 1680 

GGGGAGGGCG GGGGAATGGA CCTTGTATAG ATCTGGAGGA AAGAAAGCTA CGAAAAACTT 1740 

TTTAAAAGTT CTAGTGGTAC GGTAGGAGCT TTGCAGGAAG TTTGCAAAAG TCTTTACCAA 1800 

TAATATTTAG AGCTAGTCTC CAAGCGACGA AAAAAATGTT TTAATATTTG CAAGCAACTT 1860 

TTGTACAGTA TTTATCGAGA TAAACATGGC AATCAAAATG TCCATTGTTT ATAAGCTGAG 1920 
AATTTGCCAA TATTTTTCAA GGAGAGGCTT CTTGCTGAAT TTTGATTCTC CAGCTGAAAT 1980

TTAGGACAGT TGCAAACGTG AAAAGAAGAA AATTATTCAA ATTTGGACAT TTTAATTGTT 2040 

TAAAAATTGT ACAAAAGGAA AAAATTAGAA TAAGTACTGG CGAACCATCT CTGTGGTCTT 2100 

GTTTAAAAAG GGCAAAAGTT TTAGACTGTA CTAAATTTTA TAACTTACTG TTAAAAGCAA 2160 

AAATGGCCAT GCAGGTTGAC ACCGTTGGTA ATTTATAATA GCTTTTGTTC GATCCCAACT 2220 

TTCCATTTTG TTCAGATAAA AAAAACCATG AAATTACTGT GTTTGAAATA TTTTCTTATG 2280 

GTTTGTAATA TTTCTGTAAA TTTATTGTGA TATTTTAAGG TTTTCCCCCC TTTATTTTCC 2340 

GTAGTTGTAT TTTAAAAGAT TCGGCTCTGT ATTATTTGAA TCAGTCTGCC GAGAATCCAT 2400 

GTATATATTT GAACTAATAT CATCCTTATA ACAGGTACAT TTTCAACTTA AGTTTTTACT 2460 

CCATTATGCA CAGTTTGAGA TAAATAAATT TTTGAAATAT GGACACTGAA AAAAAAAAAA 2520 

AAAAAAACAA AACAAAAAAA CAAAAAACAA AAACAGAAAA AACAAAAAAA AAAACAAAAC 2580 

CACAACACAA AAACAAAAAA AAAAAAAAGA AACAAACACA CAACACAACA CAACACAAAA 2640 
CCACAACACA AACAACAACA CACAGAGGG 

Seq ID NO: 4 Protein sequence: 
Protein Accession #: CAA83435.1

1 11 21 31 41 51

| | | | | |

MYNMMETELK PPGPQQTSGG GGGNSTAAAA GGNQKNSPDR VKRPMNAFMV WSRGQRRKMA 60

QENPKMHNSE ISKRLGAEWK LLSETEKRPF IDEAKRLRAL HMKEHPDYKY RPRRKTKTLM 120

KKDKYTLPGG LLAPGGNSMA SGVGVGAGLG AGVNQRMDSY AHMNGWSNGS YSMMQDQLGY 180

PQHPGLNAHG AAQMQPMHRY DVSALQYNSM TSSQTYMNGS PTYSMSYSQQ GTPGMALGSM 240

GSVVKSEASS SPPVVTSSSH SRAPCQAGDL RDMISMYLPG AEVPEPAAPS RLHMSQHYQS 300
GPVPGTAING TLPLSHM

Seq ID NO: 5 DNA sequence
Nucleic Acid Accession #: U91618
Coding sequence: 29-541

1 11 21 31 41 51
| | | | | |
CGGACTTGGC TTGTTAGAAG GCTGAAAGAT GATGGCAGGA ATGAAAATCC AGCTTGTATG 60
CATGCTACTC CTGGCTTTCA GCTCCTGGAG TCTGTGCTCA GATTCAGAAG AGGAAATGAA 120
AGCATTAGAA GCAGATTTCT TGACCAATAT GCATACATCA AAGATTAGTA AAGCACATGT 180
TCCCTCTTGG AAGATGACTC TGCTAAATGT TTGCAGTCTT GTAAATAATT TGAACAGCCC 240
AGCTGAGGAA ACAGGAGAAG TTCATGAAGA GGAGCTTGTT GCAAGAAGGA AACTTCCTAC 300
TGCTTTAGAT GGCTTTAGCT TGGAAGCAAT GTTGACAATA TACCAGCTCC ACAAAATCTG 360
TCACAGCAGG GCTTTTCAAC ACTGGGAGTT AATCCAGGAA GATATTCTTG ATACTGGAAA 420
TGACAAAAAT GGAAAGGAAG AAGTCATAAA GAGAAAAATT CCTTATATTC TGAAACGGCA 480
GCTGTATGAG AATAAACCCA GAAGACCCTA CATACTCAAA AGAGATTCTT ACTATTACTG 540
AGAGAATAAA TCATTTATTT ACATGTGATT GTGATTCATC ATCCCTTAAT TAAATATCAA 600
ATTATATTTG TGTGAAAATG TGACAAACAC ACTTATCTGT CTCTTCTACA ATTGTGGTTT 660
ATTGAATGTG TTTTTCTGCA CTAATAGAAA TTAGACTAAG TGTTTTCAAA TAAATCTAAA 720
TCTTCAAAAA AAAAAAAAAA AAATGGGGCC GCAATT

Seq ID NO: 6 Protein sequence:
Protein Accession #: AAB50564

1 11 21 31 41 51
| | | | | |
MMAGMKIQLV CMLLLAFSSW SLCSDSEEEM KALEADFLTN MHTSKISKAH VPSWKMTLLN 60
VCSLVNNLNS PAEETGEVHE EELVARRKLP TALDGFSLEA MLTIYQLHKI CHSRAFQHWE 120
LIQEDILDTG NDKNGKEEVI KRKIPYILKR QLYENKPRRP YILKRDSYYY

Seq ID NO: 7 DNA sequence
Nucleic Acid Accession #: NM_006536.2
Coding sequence: 109-2940

1 11 21 31 41 51
| | | | | |
ACCTAAAACC TTGCAAGTTC AGGAAGAAAC CATCTGCATC CATATTGAAA ACCTGACACA 60
ATGTATGCAG CAGGCTCAGT GTGAGTGAAC TGGAGGCTTC TCTACAACAT GACCCAAAGG 120
AGCATTGCAG GTCCTATTTG CAACCTGAAG TTTGTGACTC TCCTGGTTGC CTTAAGTTCA 180
GAACTCCCAT TCCTGGGAGC TGGAGTACAG CTTCAAGACA ATGGGTATAA TGGATTGCTC 240
ATTGCAATTA ATCCTCAGGT ACCTGAGAAT CAGAACCTCA TCTCAAACAT TAAGGAAATG 300
ATAACTGAAG CTTCATTTTA CCTATTTAAT GCTACCAAGA GAAGAGTATT TTTCAGAAAT 360
ATAAAGATTT TAATACCTGC CACATGGAAA GCTAATAATA ACAGCAAAAT AAAACAAGAA 420
TCATATGAAA AGGCAAATGT CATAGTGACT GACTGGTATG GGGCACATGG AGATGATCCA 480
TACACCCTAC AATACAGAGG GTGTGGAAAA GAGGGAAAAT ACATTCATTT CACACCTAAT 540
TTCCTACTGA ATGATAACTT AACAGCTGGC TACGGATCAC GAGGCCGAGT GTTTGTCCAT 600
GAATGGGCCC ACCTCCGTTG GGGTGTGTTC GATGAGTATA ACAATGACAA ACCTTTCTAC 660
ATAAATGGGC AAAATCAAAT TAAAGTGACA AGGTGTTCAT CTGACATCAC AGGCATTTTT 720
GTGTGTGAAA AAGGTCCTTG CCCCCAAGAA AACTGTATTA TTAGTAAGCT TTTTAAAGAA 780
GGATGCACCT TTATCTACAA TAGCACCCAA AATGCAACTG CATCAATAAT GTTCATGCAA 840
AGTTTATCTT CTGTGGTTGA ATTTTGTAAT GCAAGTACCC ACAACCAAGA AGCACCAAAC 900
CTACAGAACC AGATGTGCAG CCTCAGAAGT GCATGGGATG TAATCACAGA CTCTGCTGAC 960
TTTCACCACA GCTTTCCCAT GAATGGGACT GAGCTTCCAC CTCCTCCCAC ATTCTCGCTT 1020
GTACAGGCTG GTGACAAAGT GGTCTGTTTA GTGCTGGATG TGTCCAGCAA GATGGCAGAG 1080
GCTGACAGAC TCCTTCAACT ACAACAAGCC GCAGAATTTT ATTTGATGCA GATTGTTGAA 1140
ATTCATACCT TCGTGGGCAT TGCCAGTTTC GACAGCAAAG GAGAGATCAG AGCCCAGCTA 1200
CACCAAATTA ACAGCAATGA TGATCGAAAG TTGCTGGTTT CATATCTGCC CACCACTGTA 1260
TCAGCTAAAA CAGACATCAG CATTTGTTCA GGGCTTAAGA AAGGATTTGA GGTGGTTGAA 1320
AAACTGAATG GAAAAGCTTA TGGCTCTGTG ATGATATTAG TGACCAGCGG AGATGATAAG 1380
CTTCTTGGCA ATTGCTTACC CACTGTGCTC AGCAGTGGTT CAACAATTCA CTCCATTGCC 1440

CTGGGTTCAT C7GCAGCCCC AAATCTGGAG GAATTATCAC GTCTTACAGG AGGTTTAAAG 1500 

TTCTTTGTTC CAGATATATC AAACTCCAAT AGCATGATTG ATGCTTTCAG TAGAATTTCC 1560 

TCTGGAACTG GAGACATTTT CCAGCAACAT ATTCAGCTTG AAAGTACAGG TGAAAATGTC 1620 

AAACCTCACC ATCAATTGAA AAACACAGTG ACTGTGGATA ATACTGTGGQ CAACGACACT 1680 

ATGTTTCTAG TTACGTGGCA GGCCAGTGGT CCTCCTGAGA TTATATTATT TGATCCTGAT 1740

GGACGAAAAT ACTACACAAA TAATTTTATC ACCAATCTAA CTTTTCGGAC AGCTAGTCTT 1800 

TGGATTCCAG GAACAGCTAA GCCTGGGCAC TGGACTTACA CCCTGAACAA TACCCATCAT 1860 

TCTCTGCAAG CCCTGAAAGT GACAGTGACC TCTCGCGCCT CCAACTCAGC TGTGCCCCCA 1920

GCCACTGTGG AAGCCTTTGT GGAAAGAGAC AGCCTCCATT TTCCTCATCC TGTGATGATT 1980 

TATGCCAATG TGAAACAGGG ATTTTATCCC ATTCTTAATG CCACTGTCAC TGCCACAGTT 2040

GAGCCAGAGA CTGGAGATCC TGTTACGCTG AGACTCCTTG ATGA7GGAGC AGGTGCTGAT 2100 

GTTATAAAAA ATGATGGAAT TTACTCGAGG TATTTTTTCT CCTTTGCTGC AAATGGTAGA 2160 

TATAGCTTGA AAGTGCATGT CAATCACTCT CCCAGCATAA GCACCCCAAC CCACTCTATT 2220 

CCAGGGAGTC ATGCTATGTA TGTACCAGGT TACACAGCAA ACGGTAATAT TCAGATGAAT 2280 

GCTCCAAGGA AATCAGTAGG CAGAAATGAG GAGGAGCGAA AGTGGGGCTT TAGCCGAGTC 2340

AGCTCAGGAG GCTCCTTTTC AGTGCTGGGA GTTCCAGCTG GCCCCCACCC TGATGTGTTT 2400 

CCACCATGCA AAATTATTGA CCTGGAAGCT GTAAAAGTAG AAGAGGAATT GACCCTATCT 2460 

TGGACAGCAC CTGGAGAAGA CTTTGATCAG GGCCAGGCTA CAAGCTATGA AATAAGAATG 2520 

AGTAAAAGTC TACAGAATAT CCAAGATGAC TTTAACAATG CTATTTTAGT AAATACATCA 2580 

AAGCGAAATC CTCAGCAAGC TGGCATCAGG GAGATATTTA CGTTCTCACC CCAGATTTCC 2640

ACGAATGGAC CTGAACATCA GCCAAATGGA GAAACACATG AAAGCCACAG AATTTATGTT 2700 

GCAATACGAG CAATGGATAG GAACTCCTTA CAGTCTGCTG TATCTAACAT TGCCCAGGCG 2760

CCTCTGTTTA TTCCCCCCAA TTCTGATCCT GTACCTGCCA GAGATTATCT TATATTGAAA 2820 

GGAGTTTTAA CAGCAATGGG TTTGATAGGA ATCATTTGCC TTATTATAGT TGTGACACAT 2880

CATACTTTAA GCAGGAAAAA GAGAGCAGAC AAGAAAGAGA ATGGAACAAA ATTATTATAA 2940

ATAAATATCC AAAGTGTCTT CCTTCTTAGA TATAAGACCC ATGGCCTTCG ACTACAAAAA 3000 

CATACTAACA AAGTCAAATT AACATCAAAA CTGTATTAAA ATGCATTGAG TTTTTGTACA 3060 

ATACAGATAA GATTTTTACA TGGTAGATCA ACAATTCTTT TTGGGGGTAG ATTAGAAAAC 3120 

CCTTACACTT TGGCTATGAA CAAATAATAA AAATTATTCT TTAAAGTAAT GTCTTTAAAG 3180 

GCAAAGGGAA GGGTAAAGTC GGACCAGTGT CAAGGAAAGT TTGTTTTATT GAGGTGGAAA 3240

AATAGCCCCA AGCAGAGAAA AGGAGGGTAG GTCTGCATTA TAACTGTCTG TGTGAAGCAA 3300 

TCATTTAGTT ACTTTGATTA ATTTTTCTTT TCTCCTTATC TGTGCAGTAC AGGTTGCTTG 3360 

TTTACATGAA GATCATGCTA TATTTTATAT ATGTAGCCCC TAATGCAAAG CTCTTTACCT 3420 

CTTGCTATTT TGTTATATAT ATTTCAGATG ACATCTCCCT GCTAATGCTC AGAGATCTTT 3480 

TTTCACTGTA AGAGGTAACC TTTAACAATA TGGGTATTAC CTTTGTCTCT TCATACCGGT 3540

TTTATGACAA AGGTCTATTG AATTTATTTG TNTGTAAGTT TCTACTCCCA TCAAAGCAGC 3600 

TTTCTAAGTT TATTGCCTTG GGTTATTATG GAATGATAGT TATAGCCCCN TATAATGCCT 3660 

TACCTAGGAA A 

Seq ID NO: 8 Protein sequence:

Protein Accession #: NP_006527.1 

1 11 21 31 41 51 

| | | | | |

MTQRSIAGPI CNLKFVTLLV ALSSELPFLG AGVQLQDNGY KGLLIAINPQ VPENQNLISN 60

IKEMITEASF YLFNATKRRV FFRNIKILIP ATWKANNNSK IKQESYEKAN VIVTDWYGAH 120 

GDDPYTLQYR GCGKEGKYIH FTPNFLLNDN LTAGYGSRGR VFVHEWAHLR WGVFDEYNND 180 

KPFYINGQNQ IKVTRCSSDI TGIFVCEKGP CPQENCIISK LFKEGCTFIY NSTQNATASI 240

MFMQSLSSVV EFCNASTHNQ EAPNLQNQMC SLRSAWDVIT DSADFHHSFP MNGTELPPPP 300

TFSLVQAGDK VVCLVLDVSS KMAEADRLLQ LQQAAEFYLM QIVEIHTFVG IASFDSKGEI 360

RAQLHQINSN DDRKLLVSYL PTTVSAKTDI SICSGLKKGF EVVEKLNGKA YGSVMILVTS 420

GDDKLLGNCL PTVLSSGSTI HSIALGSSAA PNLEELSRLT GGLKFFVPDI SNSNSMIDAF 480

SRISSGTGDI FQQHIQLEST GENVKPHHQL KNTVTVDNTV GNDTMFLVTW QASGPPEIIL 540 

FDPDGRKYYT NNFITNLTFR TASLWIPGTA KPGHWTYTLN NTHHSLQALK VTVTSRASNS 600 

AVPPATVEAF VERDSLHFPH PVMIYANVKQ GFYPILNATV TATVEPETGD PVTLRLLDDG 660

AGADVIKNDG IYSRYFFSFA ANGRYSLKVH VNHSPSISTP AHSIPGSHAM YVPGYTANGN 720 

IQMNAPRKSV GRNEEERKWG FSRVSSGGSF SVLGVPAGPH PDVFPPCKII DLEAVKVEEE 780 

LTLSWTAPGE DFDQGQATSY EIRMSKSLQN IQDDFNNAIL VNTSKRNPQQ AGIREIFTFS 840

PQISTNGPEH QPNGETHESH RIYVAIRAMD RNSLQSAVSN IAQAPLFIPP NSDPVPARDY 900 
LILKGVLTAM GLIGIICLII VVTHHTLSRK KRADKKENGT KLL



Seq ID NO: 9 DNA sequence 
Nucleic Acid Accession #: Eos sequence 
Coding sequence: 336-632

1 11 21 31 41 51 

I I I I I I 

CTCCCCTCAC CCCGGTCCAG GATGCCCAGT CCCCACGACA CCTCCCACTT CCCACTGTGG 60 

CCTGGGTGGG CTCAGGGGCT GCCCTTGACC TGGCCTAGAG CCCTCCCCCA GCTGGTGGTG 120

GAGCTGGCAC TCTCTGGGAG GGAGGGGGCT GGGAGGGAAT GAGTGGGAAT GGCAAGAGGC 180 

CAGGGTTTGG TGGGATCAGG TTGAGGCAGG TTTGGTTTCC TTAAAATGCC AAGTTGGGGG 240 

CCAGTGGGGC CCACATATAA ATCCTCACCC TGGGAGCCTG GCTGCCTTGC TCTCCTTCCT 300 

GGGTCTGTCT CTGCCACCTG GTCTGCCACA GATCCATGAT GTGCAGTTCT CTGGAGCAGG 360 

CGCTGGCTGT GCTGGTCACT ACCTTCCACA AGTACTCCTG CCAAGAGGGC GACAAGTTCA 420

AGCTGAGTAA GGGGGAAATG AAGGAACTTC TGCACAAGGA GCTGCCCAGC TTTGTGGGGG 480 

AGAAAGTGGA TGAGGAGGGG CTGAAGAAGC TGATGGGCAG CCTGGATGAG AACAGTGACC 540 

AGCAGGTGGA CTTCCAGGAG TATGCTGTTT TCCTGGCACT CATCACTGTC ATGTGCAATG 600 

ACTTCTTCCA GGGCTGCCCA GACCGACCCT GAAGCAGAAC TCTTGACTTC CTGCCATGGA 660 

TCTCTTGGGC CCAGGACTGT TGATGCCTTT GAGTTTTGTA TTCAATAAAC TTTTTTTGTC 720

TGTTGATAAT ATTTTAATTG CTCAGTGATG TTCCATAACC CGGCTGGCTC AGCTGGAGTG 780 

CTGGGAGATG AGGGCCTCCT GGATCCTGCT CCCTTCTGGG CTCTGACTCT CCTGGAAATC 840 

TCTCCAAGGC CAGAGCTATG CTTTAGGTCT CAATTTTGGA ATTTCAAACA CCAGCAAAAA 900 

ATTGGAAATC GAGATAGGTT GCTGACTTTT ATTTTGTCAA ATAAAGATAT TAAAAAAGGC 960 

AAATACCA

Seq ID NO: 10 Protein sequence:
Protein Accession #: NP_005969.1

1 11 21 31 41 51
| | | | | |
MMCSSLEQAL AVLVTTFHKY SCQEGDKFKL SKGEMKELLH KELPSFVGEK VDEEGLKKLM 60
GSLDENSDQQ VDFQEYAVFL ALITVMCNDF FQGCPDRP

Seq ID NO: 11 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 336-626

1 11 21 31 41 51
| | | | | |
CTCCCCTCAC CCCGGTCCAG GATGCCCAGT CCCCACGACA CCTCCCACTT CCCACTGTGG 60
CCTGGGTGGG CTCAGGGGCT GCCCTTGACC TGGCCTAGAG CCCTCCCCCA GCTGGTGGTG 120
GAGCTGGCAC TCTCTGGGAG GGAGGGGGCT GGGAGGGAAT GAGTGGGAAT GGCAAGAGGC 180
CAGGGTTTGG TGGGATCAGG TTGAGGCAGG TTTGGTTTCC TTAAAATGCC AAGTTGGGGG 240
CCAGTGGGGC CCACATATAA ATCCTCACCC TGGGAGCCTG GCTGCCTTGC TCTCCTTCCT 300
GGGTCTGTCT CTGCCACCTG GTCTGCCACA GATCCATGAT GTGCAGTTCT CTGGAGCAGG 360
CGCTGGCTGT GCTGGTCACT ACCTTCCACA AGTACTCCTG CCAAGAGGGC GACAAGTTCA 420
AGCTGAGTAA GGGGGAAATG AAGGAACTTC TGCACAAGGA GCTGCCCAGC TTTGTGGGGC 480
ATTCCAGAGA ACCATGTGCT GTGAGGGCCT TCCGAGTCCA TCTGTTTAAT CCTGTCATTG 540
GAGACTTGAG AAACCAGAGC CCAGAAGGGA AAAGTGATTG TCCCAAGATC ACACAGCACT 600
GGAGAAAGTG GATGAGGAGG GGCTGAAGAA GCTGATGGGC AGCCTGGATG AGAACAGTGA 660
CCAGCAGGTG GACTTCCAGG AGTATGCTGT TTTCCTGGCA CTCATCACTG TCATGTGCAA 720
TGACTTCTTC CAGGGCTGCC CAGACCGACC CTGAAGCAGA ACTCTTGACT TCCTGCCATG 780
GATCTCTTGG GCCCAGGACT GTTGATGCCT TTGAGTTTTG TATTCAATAA ACTTTTTTTG 840
TCTGTTGATA ATATTTTAAT TGCTCAGTGA TGTTCCATAA CCCGGCTGGC TCAGCTGGAG 900
TGCTGGGAGA TGAGGGCCTC CTGGATCCTG CTCCCTTCTG GGCTCTGACT CTCCTGGAAA 960
TCTCTCCAAG GCCAGAGCTA TGCTTTAGGT CTCAATTTTG GAATTTCAAA CACCAGCAAA 1020
AAATTGGAAA TCGAGATAGG TTGCTGACTT TTATTTTGTC AAATAAAGAT ATTAAAAAAG 1080
GCAAATACCA

Seq ID NO: 12 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51
| | | | | |
MMCSSLEQAL AVLVTTFHKY SCQEGDKFKL SKGEMKELLH KELPSFVGHS REPCAVRAFR 60
VHLFNPVIGD LRNQSPEGKS DCPKITQHWR KWMRRG


Seq ID NO: 13 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 58-354

1 11 21 31 41 51
| | | | | |
GTGAGCTCAC CATGTGGGGG TGAGGCTGAG AGAAAACAAG TACACAGCCA CAGATCCATG 60
ATGTGCAGTT CTCTGGAGCA GGCGCTGGCT GTGCTGGTCA CTACCTTCCA CAAGTACTCC 120
TGCCAAGAGG GCGACAAGTT CAAGCTGAGT AAGGGGGAAA TGAAGGAACT TCTGCACAAG 180
GAGCTGCCCA GCTTTGTGGG GGAGAAAGTG GATGAGGAGG GGCTGAAGAA GCTGATGGGC 240
AGCCTGGATG AGAACAGTGA CCAGCAGGTG GACTTCCAGG AGTATGCTGT TTTCCTGGCA 300
CTCATCACTG TCATGTGCAA TGACTTCTTC CAGGGCTGCC CAGACCGACC CTGAAGCAGA 360
ACTCTTGACT TCCTGCCATG GATCTCTTGG GCCCAGGACT GTTGATGCCT TTGAGTTTTG 420
TATTCAATAA ACTTTTTTTG TCTGTTGATA ATATTTTAAT TGCTCAGTGA TGTTCCATAA 480
CCCGGCTGGC TCAGCTGGAG TGCTGGGAGA TGAGGGCCTC CTGGATCCTG CTCCCTTCTG 540
GGCTCTGACT CTCCTGGAAA TCTCTCCAAG GCCAGAGCTA TGCTTTAGGT CTCAATTTTG 600
GAATTTCAAA CACCAGCAAA AAATTGGAAA TCGAGATAGG TTGCTGACTT TTATTTTGTC 660
AAATAAAGAT ATTAAAAAAG GCAAATACCA

Seq ID NO: 14 Protein sequence:
Protein Accession #: NP_005969.1

1 11 21 31 41 51
| | | | | |
MMCSSLEQAL AVLVTTFHKY SCQEGDKFKL SKGEMKELLH KELPSFVGEK VDEEGLKKLM 60
GSLDENSDQQ VDFQEYAVFL ALITVMCNDF FQGCPDRP

Seq ID NO: 15 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 62-358

1 11 21 31 41 51
| | | | | |
GGAGGGTGTG CCGCTGAGTC ACTGCCTGGG CATCTGGGCC TGGAACCTCG GCCACAGATC 60
CATGATGTGC AGTTCTCTGG AGCAGGCGCT GGCTGTGCTG GTCACTACCT TCCACAAGTA 120
CTCCTGCCAA GAGGGCGACA AGTTCAAGCT GAGTAAGGGG GAAATGAAGG AACTTCTGCA 180
CAAGGAGCTG CCCAGCTTTG TGGGGGAGAA AGTGGATGAG GAGGGGCTGA AGAAGCTGAT 240
GGGCAGCCTG GATGAGAACA GTGACCAGCA GGTGGACTTC CAGGAGTATG CTGTTTTCCT 300
GGCACTCATC ACTGTCATGT GCAATGACTT CTTCCAGGGC TGCCCAGACC GACCCTGAAG 360
CAGAACTCTT GACTTCCTGC CATGGATCTC TTGGGCCCAG GACTGTTGAT GCCTTTGAGT 420
TTTGTATTCA ATAAACTTTT TTTGTCTGTT GATAATATTT TAATTGCTCA GTGATGTTCC 480
ATAACCCGGC TGGCTCAGCT GGAGTGCTGG GAGATGAGGG CCTCCTGGAT CCTGCTCCCT 540
TCTGGGCTCT GACTCTCCTG GAAATCTCTC CAAGGCCAGA GCTATGCTTT AGGTCTCAAT 600
TTTGGAATTT CAAACACCAG CAAAAAATTG GAAATCGAGA TAGGTTGCTG ACTTTTATTT 660
TGTCAAATAA AGATATTAAA AAAGGCAAAT ACCA

Seq ID NO: 16 Protein sequence:
Protein Accession #: NP_005969.1

1 11 21 31 41 51
| | | | | |
MMCSSLEQAL AVLVTTFHKY SCQEGDKFKL SKGEMKELLH KELPSFVGEK VDEEGLKKLM 60
GSLDENSDQQ VDFQEYAVFL ALITVMCNDF FQGCPDRP

Seq ID NO: 17 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 939-2372

1 11 21 31 41 51
| | | | | |
AAGACGGATT CTCAGACAAG GCTTGCAAAT GCCCCGCAGC CATCATTTAA CTGCACCCGC 60
AGAATAGTTA CGGTTTGTCA CCCGACCCTC CCGGATCGCC TAATTTGTCC CTAGTGAGAC 120
CCCGAGGCTC TGCCCGCGCC TGGCTTCTTC GTAGCTGGAT GCATATCGTG CTCCGGGCAG 180
CGCGGGCGCA GGGCACGCGT TCGCGCACAC CCTAGCACAC ATGAACACGC GCAAGAGCTG 240
AACCAAGCAC GGTTTCCATT TCAAAAAGGG AGACAGCCTC TACCGCGATT GTAGAAGAGA 300
CTGTGGTGTG AATTAGGGAC GGGGAGGCGT CGAACGGAGG AACGGTTCAT CTTAGAGACT 360
AATTTTCTGG AGTTTCTGCC CCTGCTCTGC GTCAGCCCTC ACGTCACTTC GCCAGCAGTA 420
GCAGAGGCGG CGGCGGCGGC TCCCGGAATT GGGTTGGAGC AGGAGCCTCG CTGGCTGCTT 480
CGCTCGCGCT CTACGOGCTC AGTCCCCGGC GGTAGCAGGA GCCTGGACCC AGGCGCCGCC 540
GGCGGGCGTG AGGCGCCGGA GCCCGGCCTC GAGGTGCATA CCGGACCCCC ATTCGCATCT 600
AACAAGGAAT CTGCGCCCCA GAGAGTCCCG GGAGCGCCGC CGGTCGGTGC CCGGCGCGCC 660
GGGCCATGCA GCGACGGCCG CCGCGGAGCT CCGAGCAGCG GTAGCGCCCC CCTGTAAAGC 720
GGTTCGCTAT GCCGGGGCCA CTGTGAACCC TGCCGCCTGC CGGAACACTC TTCGCTCOGG 780
ACCAGCTCAG CCTCTGATAA GCTGGACTCG GCACGCCCGC AACAAGCACC GAGGAGTTAA 840
GAGAGCCGCA AGCGCAGGGA AGGCCTCCCC GCACGGGTGG GGGAAAGCGG CCGGTGCAGC 900
GCGGGGACAG GCACTCGGGC TGGCACTGGC TGCTAGGGAT GTCGTCCTGG ATAAGGTGGC 960
ATGGACCCGC CATGGCGCGG CTCTGGGGCT TCTGCTGGCT GGTTGTGGGC TTCTGGAGGG 1020
CCGCTTTCGC CTGTCCCACG TCCTGCAAAT GCAGTGCCTC TCGGATCTGG TGCAGCGACC 1080
CTTCTCCTGG CATCGTGGCA TTTCCGAGAT TGGAGCCTAA CAGTGTAGAT CCTGAGAACA 1140
TCACCGAAAT TTTCATCGCA AACCAGAAAA GGTTAGAAAT CATCAACGAA GATGATGTTG 1200
AAGCTTATGT GGGACTGAGA AATCTGACAA TTGTGGATTC TGGATTAAAA TTTGTGGCTC 1260
ATAAAGCATT TCTGAAAAAC AGCAACCTGC AGCACATCAA TTTTACCCGA AACAAACTGA 1320
CGAGTTTGTC TAGGAAACAT TTCCGTCACC TTGACTTGTC TGAACTGATC CTGGTGGGCA 1380
ATCCATTTAC ATGCTCCTGT GACATTATGT GGATCAAGAC TCTCCAAGAG GCTAAATCCA 1440
GTCCAGACAC TCAGGATTTG TACTGCCTGA ATGAAAGCAG CAAGAATATT CCCCTGGCAA 1500
ACCTGCAGAT ACCCAATTGT GGTTTGCCAT CTGCAAATCT GGCCGCACCT AACCTCACTG 1560
TGGAGGAAGG AAAGTCTATC ACATTATCCT GTAGTGTGGC AGGTGATCCG GTTCCTAATA 1620
TGTATTGGGA TGTTGGTAAC CTGGTTTCCA AACATATGAA TGAAACAAGC CACACACAGG 1680
GCTCCTTAAG GATAACTAAC ATTTCATCCG ATGACAGTGG GAAGCAGATC TCTTGTGTGG 1740
CGGAAAATCT TGTAGGAGAA GATCAAGATT CTGTCAACCT CACTGTGCAT TTTGCACCAA 1800
CTATCACATT TCTCGAATCT CCAACCTCAG ACCACCACTG GTGCATTCCA TTCACTGTGA 1860
AAGGCAACCC CAAACCAGCG CTTCAGTGGT TCTATAACGG GGCAATATTG AATGAGTCCA 1920
AATACATCTG TACTAAAATA CATGTTACCA ATCACACGGA GTACCACGGC TGCCTCCAGC 1980
TGGATAATCC CACTCACATG AACAATGGGG ACTACACTCT AATAGCCAAG AATGAGTATG 2040
GGAAGGATGA GAAACAGATT TCTGCTCACT TCATGGGCTG GCCTGGAATT GACGATGGTG 2100
CAAACCCAAA TTATCCTGAT GTAATTTATG AAGATTATGG AACTGCAGCG AATGACATCG 2160
GGGACACCAC GAACAGAAGT AATGAAATCC CTTCCACAGA CGTCACTGAT AAAACCGGTC 2220
GGGAACATCT CTCGGTCTAT GCTGTGGTGG TGATTGCGTC TGTGGTGGGA TTTTGCCTTT 2280
TGGTAATGCT GTTTCTGCTT AAGTTGGCAA GACACTCCAA GTTTGGCATG AAAGGTTTTG 2340
TTTTGTTTCA TAAGATCCCA CTGGATGGGT AGCTGAAATA AAGGAAAAGA CAGAGAAAGG 2400
GGCTGTGGTG CTTGTTGGTT GATGCTGCCA TGTAAGCTGG ACTCCTGGGA CTGCTGTTGG 2460
CTTATCCCGG GAAGTGCTGC TTATCTGGGG TTTTCTGGTA GATGTGGGCG GTGTTTGGAG 2520
GCTGTACTAT ATGAAGCCTG CATATACTGT GAGCTGTGAT TGGGGAACAC CAATGCAGAG 2580
GTAACTCTCA GGCAGCTAAG CAGCACCTCA AGAAAACATG TTAAATTAAT GCTTCTCTTC 2640
TTACAGTAGT TCAAATACAA AACTGAAATG AAATCCCATT GGATTGTACT TCTCTTCTGA 2700
AAAGTGTGCT TTTTGACCCT ACTGGACATT TATTGACTTA ATTGCTTCTG TTTATTAAAA 2760
TTGACCTGCA AAGTTAAAAA AAAATTAAAG TTGAGAACAG GTATAAGTGC ACACTGAATA 2820
GTCTAATCTA CATGTAACAC ATATTTTAGT GTGATTTTCT ATACTCTAAT CAGCACTGAA 2880
TTCAGAGGGT TTGACTTTTT CATCTATAAC ACAGTGACTA AAAGAGTTAA GGGTATATAT 2940
ACCATCACTT TGGGACTTGG TAGTATTATT AAAAGGTTAT TTCCTTCACT GTCAATAAAA 3000
GTCCAAATGT TTAGCTTAGG TCTGAGAGTC AAACAATGTT AAGGATTGTC TTAAAGTTCC 3060
TTAGCCAGCA AAACAAAACA AAACAAAACA AACAAATGAA AAACGTTTAA AAAGAAGAAG 3120
AAGAAAAAAA ACAAGAACAA GCAGCAACAG CTGTTTTGTT GGGGCTATAG ATTTAAGTTA 3180
GGCATAGTCA ATTTCAGAAT AACTAAGAGT GGAATATATG CATATGGTGA AATTATAACC 3240
TTGCCCTTTT TTATTTGCCC TCTGCGATCC ACCTGCTTTT TAGAAGTCTG CCGAGTGAGA 3300
AGGCCACAGT ATCTCATGCT GTTTGCATTA CAGAACTGCA GCTTTTCTAC TCTGAAAAGG 3360
CCTGGGAGCA GAATGGCTGG CCTGCTGTGA GCAGGAGAGG AGATTCTAAG AAGGATAGTC 3420
CCCCCTACAA CATACTGTCA TACTGCTGGG TTTTCATGGG TAGGAAAGCT TGTCCTGACC 3480
CCAGCAGCAA AGAGGTGGCA GGTCGCTAAT GAATATATGC TTTATAATGT CCTTCTTCAT 3540
TGCTGAGAGG GCAGCCTTAG AGCTGTGGAT TTCTGCATCC CCCCTGAGTC TGACCCATGG 3600
ACACCTGTTT CATTCACTTT AGCATCACAG TGACCTTTGT ATGCTCTGTT CAGTCTGTGT 3660
CAGGCAGTAT GCTTGTCCTG AAGAGAGGTT TGGCTATCCC CACCCCACCC CACCCCACCC 3720
TGTTCCTTTT TTATCAGGAG GACTTCAGAG CCAGGCCTGC AGCATTTTGT TTGAAAACAC 3780
AATCAGCTCT GACAGTTAGA CATGCACACA GACGCCATAG CTGGATTGGA AACATTGATG 3840
TTTTAAAAAT TTATTTTTTT TGGAAATAGT TGCACAAATG CTGCAATTTA GCTTTAAGGT 3900
TCTATAGATT TTTAACTAGT CCAACACAGT CAGAAACATT GTTTTGAATC CTCTGTAAAC 3960
CAAGGCATTA ATCTTAATAA ACCAGGATCC ATTTAGGTAC CACTTGATAT AAAAAGGATA 4020
TCCATAATGA ATATTTTATA CTGCATCCTT TACATTAGCC ACTAAATACG TTATTGCTTG 4080
ATGAAGACCT TTCACAGAAT CCTATGGATT GCAGCATTTC ACTTGGCTAC TTCATACCCA 4140
TGCCTTAAAG AGGGGCAGTT TCTCAAAAGC AGAAACATGC CGCCAGTTCT CAAGTTTTCC 4200

TCCTAACTCC ATTTGAATGT AAGGGCAGCT GGCCCCCAAT GTGGGGAGGT CCGAACATTT 4260 

TCTGAATTCC CATTTTCTTG TTCGCGGCTA AATGACAGTT TCTGTCATTA CTTAGATTCC 4320 

GATCTTTCCC AAAGGTGTTG ATTTACAAAG AGGCCAGCTA ATAGCAGAAA TCATGACCCT 4380 

GAAAGAGAGA TGAAATTCAA GCTGTGAGCC AGGCAGGAGC TCAGTATGGC AAAGGTTCTT 4440 

GAGAATCAGC CATTTGGTAC AAAAAAGATT TTTAAAGCTT TTATGTTATA CCATGGAGCC 4500 

ATAGAAAGGC TATGGATTGT TTAAGAACTA TTTTAAAGTG TTCCAGACCC AAAAAGGAAA 4560 

AATAAAAAAA AAGGAATATT TGTACCCAAC AGCTAGAAGG ATTGCAAGGT AGATTTTTGT 4620 

TTTAAAATGG AGAGAAGTGG ACAGATAAGG CCATTTAATA TATCAAAGAT CAGTTGACAT 4680 
CTCCTAGGGA ATGATGAAAA CAGCAGGCTA T 

Seq ID NO: 18 Protein sequence:
Protein Accession #: CAA53571

1 11 21 31 41 51 

I I I I I I 

MSSWIRWHGP AMARLWGFCW LVVGFWRAAF ACPTSCKCSA SRIWCSDPSP GIVAFPRLEP 60

NSVDPENITE IFIANQKRLE IINEDDVEAY VGLRNLTIVD SGLKFVAHKA FLKNSNLQHI 120

NFTRNKLTSL SRKHFRHLDL SELILVGNPF TCSCDIMWIK TLQEAKSSPD TQDLYCLNES 180 

SKNIPLANLQ IPNCGLPSAN LAAPNLTVEE GKSITLSCSV AGDPVPNMYW DVGNLVSKHM 240 

NETSHTQGSL RITNISSDDS GKQISCVAEN LVGEDQDSVN LTVHFAPTIT FLESPTSDHH 300 

WCIPFTVKGN PKPALQWFYN GAILNSSKYI CTKIHVTNHT EYHGCLQLDN PTHMNNGDYT 360

LIAKNEYGKD EKQISAHFMG WPGIDDGANP NYPDVIYEDY GTAANDIGDT TNRSNEIPST 420 
DVTDKTGREH LSVYAVVVIA SVVGFCLLVM LFLLKLARHS KFGMKGFVLF HKIPLDG



Seq ID NO: 19 DNA sequence

Nucleic Acid Accession #: NM_000228

Coding sequence: 82-3600

1 11 21 31 41 51 

I I I I I I 

GCTTTCAGGC GATCTGGAGA AAGAACGGCA GAACACACAG CAAGGAAAGG TCCTTTCTGG 60 

GGATCACCCC ATTGGCTGAA GATGAGACCA TTCTTCCTCT TGTGTTTTGC CCTGCCTGGC 120 

CTCCTGCATG CCCAACAAGC CTGCTCCCGT GGGGCCTGCT ATCCACCTGT TGGGGACCTG 180 

CTTGTTGGGA GGACCCGGTT TCTCCGAGCT TCATCTACCT GTGGACTGAC CAAGCCTGAG 240 

ACCTACTGCA CCCAGTATGG CGAGTGGCAG ATGAAATGCT GCAAGTGTGA CTCCAGGCAG 300 

CCTCACAACT ACTACAGTCA CCGAGTAGAG AATGTGGCTT CATCCTCCGG CCCCATGCGC 360 

TGGTGGCAGT CCCAGAATGA TGTGAACCCT GTCTCTCTGC AGCTGGACCT GGACAGGAGA 420 

TTCCAGCTTC AAGAAGTCAT GATGGAGTTC CAGGGGCCCA TGCCCGCCGG CATGCTGATT 480

GAGCGCTCCT CAGACTTCGG TAAGACCTGG CGAGTGTACC AGTACCTGGC TGCCGACTGC 540 

ACCTCCACCT TCCCTCGGGT CCGCCAGGGT CGGCCTCAGA GCTGGCAGGA TGTTCGGTGC 600 

CAGTCCCTGC CTCAGAGGCC TAATGCACGC CTAAATGGGG GGAAGGTCCA ACTTAACCTT 660 

ATGGATTTAG TGTCTGGGAT TCCAGCAACT CAAAGTCAAA AAATTCAAGA GGTGGGGGAG 720 

ATCACAAACT TGAGAGTCAA TTTCACCAGG CTGGCCCCTG TGCCCCAAAG GGGCTACCAC 780 

CCTCCCAGCG CCTACTATGC TGTGTCCCAG CTCCGTCTGC AGGGGAGCTG CTTCTGTCAC 840 

GGCCATGCTG ATCGCTGCGC ACCCAAGCCT GGGGCCTCTG CAGGCCCCTC CACCGCTGTG 900 

CAGGTCCACG ATGTCTGTGT CTGCCAGCAC AACACTGCCG GCCCAAATTG TGAGCGCTGT 960 

GCACCCTTCT ACAACAACCG GCCCTGGAGA CCGGCGGAGG GCCAGGACGC CCATGAATGC 1020 

CAAAGGTGCG ACTGCAATGG GCACTCAGAG ACATGTCACT TTGACCCCGC TGTGTTTGCC 1080 

GCCAGCCAGG GGGCATATGG AGGTGTGTGT GACAATTGCC GGGACCACAC CGAAGGCAAG 1140 

AACTGTGAGC GGTGTCAGCT GCACTATTTC CGGAACCGGC GCCCGGGAGC TTCCATTCAG 1200 

GAGACCTGCA TCTCCTGCGA GTGTGATCCG GATGGGGCAG TGCCAGGGGC TCCCTGTGAC 1260 

CCAGTGACCG GGCAGTGTGT GTGCAAGGAG CATGTGCAGG GAGAGCGCTG TGACCTATGC 1320 

AAGCCGGGCT TCACTGGACT CACCTACGCC AACCCGCAGG GCTGCCACCG CTGTGACTGC 1380 

AACATCCTGG GGTCCCGGAG GGACATGCCG TGTGACGAGG AGAGTGGGCG CTGCCTTTGT 1440 

CTGCCCAACG TGGTGGGTCC CAAATGTGAC CAGTGTGCTC CCTACCACTG GAAGCTGGCC 1500 

AGTGGCCAGG GCTGTGAACC GTGTGCCTGC GACCCGCACA ACTCCCCTCA GCCCACAGTG 1560 

CAACCAGTTC ACAGGGCAGT GCCCTGTCGG GAAGGCTTTG GTGGCCTGAT GTGCAGCGCT 1620 

GCAGCCATCC GCCAGTGTCC AGACCGGACC TATGGAGACG TGGCCACAGG ATGCCGAGCC 1680 

TGTGACTGTG ATTTCCGGGG AACAGAGGGC CCGGGCTGCG ACAAGGCATC AGGCCGCTGC 1740 

CTCTGCCGCC CTGGCTTGAC CGGGCCCCGC TGTGACCAGT GCCAGCGAGG CTACTGCAAT 1800 

CGCTACCCGG TGTGCGTGGC CTGCCACCCT TGCTTCCAGA CCTATGATGC GGACCTCCGG 1860 

GAGCAGGCCC TGCGCTTTGG TAGACTCCGC AATGCCACCG CCAGCCTGTG GTCAGGGCCT 1920 

GGGCTGGAGG ACCGTGGCCT GGCCTCCCGG ATCCTAGATG CAAAGAGTAA GATTGAGCAG 1980 

ATCCGAGCAG TTCTCAGCAG CCCCGCAGTC ACAGAGCAGG AGGTGGCTCA GGTGGCCAGT 2040 

GCCATCCTCT CCCTCAGGCG AACTCTCCAG GGCCTGCAGC TGGATCTGCC CCTGGAGGAG 2100 

GAGACGTTGT CCCTTCCGAG AGACCTGGAG AGTCTTGACA GAAGCTTCAA TGGTCTCCTT 2160 

ACTATGTATC AGAGGAAGAG GGAGCAGTTT GAAAAAATAA GCAGTGCTGA TCCTTCAGGA 2220 

GCCTTCCGGA TGCTGAGCAC AGCCTACGAG CAGTCAGCCC AGGCTGCTCA GCAGGTCTCC 2280 

GACAGCTCGC GCCTTTTGGA CCAGCTCAGG GACAGCCGGA GAGAGGCAGA GAGGCTGGTG 2340 

CGGCAGGCGG GAGGAGGAGG AGGCACCGGC AGCCCCAAGC TTGTGGCCCT GAGGCTGGAG 2400 

ATGTCTTCGT TGCCTGACCT GACACCCACC TTCAACAAGC TCTGTGGCAA CTCCAGGCAG 2460 

ATGGCTTGCA CCCCAATATC ATGCCCTGGT GAGCTATGTC CCCAAGACAA TGGCACAGCC 2520 

TGTGGCTCCC GCTGCAGGGG TGTCCTTCCC AGGGCCGGTG GGGCCTTCTT GATGGCGGGG 2580 

CAGGTGGCTG AGCAGCTGCG GGGCTTCAAT GCCCAGCTCC AGCGGACCAG GCAGATGATT 2640 

AGGGCAGCCG AGGAATCTGC CTCACAGATT CAATCCAGTG CCCAGCGCTT GGAGACCCAG 2700 

GTGAGCGCCA GCCGCTCCCA GATGGAGGAA GATGTCAGAC GCACACGGCT CCTAATCCAG 2760 

CAGGTCCGGG ACTTCCTAAC AGACCCCGAC ACTGATGCAG CCACTATCCA GGAGGTCAGC 2820 

GAGGCCGTGC TGGCCCTGTG GCTGCCCACA GACTCAGCTA CTGTTCTGCA GAAGATGAAT 2880 

GAGATCCAGG CCATTGCAGC CAGGCTCCCC AACGTGGACT TGGTGCTGTC CCAGACCAAG 2940 

CAGGACATTG CGCGTGCCCG CCGGTTGCAG GCTGAGGCTG AGGAAGCCAG GAGCCGAGCC 3000 

CATGCAGTGG AGGGCCAGGT GGAAGATGTG GTTGGGAACC TGCGGCAGGG GACAGTGGCA 3060 

CTGCAGGAAG CTCAGGACAC CATGCAAGGC ACCAGCCGCT CCCTTCGGCT TATCCAGGAC 3120 

AGGGTTGCTG AGGTTCAGCA GGTACTGCGG CCAGCAGAAA AGCTGGTGAC AAGCATGACC 3180 

AAGCAGCTGG GTGACTTCTG GACACGGATG GAGGAGCTCC GCCACCAAGC CCGGCAGCAG 3240 

GGGGCAGAGG CAGTCCAGGC CCAGCAGCTT GCGGAAGGTG CCAGCGAGCA GGCATTGAGT 3300 

GCCCAAGAGG GATTTGAGAG AATAAAACAA AAGTATGCTG AGTTGAAGGA CCGGTTGGGT 3360 
CAGAGTTCCA TGCTGGGTGA GCAGGGTGCC CGGATCCAGA GTGTGAAGAC AGAGGCAGAG 3420

GAGCTGTTTG GGGAGACCAT GGAGATGATG GACAGGATGA AAGACATGGA GTTGGAGCTG 3480 

CTGCGGGGCA GCCAGGCCAT CATGCTGCGC TCGGCGGACC TGACAGGACT GGAGAAGCGT 3540 

GTGGAGCAGA TCCGTGACCA CATCAATGGG CGCGTGCTCT ACTATGCCAC CTGCAAGTGA 3600 

TGCTACAGCT TCCAGCCCGT TGCCCCACTC ATCTGCCGCC TTTGCTTTTG GTTGGGGGCA 3660 

GATTGGGTTG GAATGCTTTC CATCTCCAGG AGACTTTCAT GCAGCCTAAA GTACAGCCTG 3720 

GACCACCCCT GGTGTGTAGC TAGTAAGATT ACCCTGAGCT GCAGCTGAGC CTGAGCCAAT 3780 

GGGACAGTTA CACTTGACAG ACAAAGATGG TGGAGATTGG CATGCCATTG AAACTAAGAG 3840 

CTCTCAAGTC AAGGAAGCTG GGCTGGGCAG TATCCCCCGC CTTTAGTTCT CCACTGGGGA 3900 

GGAATCCTGG ACCAAGCACA AAAACTTAAC AAAAGTGATG TAAAAATGAA AAGCCAAATA 3960 

AAAATcrnra a 

Seq ID NO: 20 Protein sequence: 
Protein Accession #: NP_000219

1 11 21 31 41 51 

I I I I I I 

MRPFFLLCFA LPGLLHAQQA CSRGACYPPV GDLLVGRTRF LRASSTCGLT KPETYCTQYG 60

EWQMKCCKCD SRQPHNYYSH RVENVASSSG PMRWWQSQND VNPVSLQLDL DRRFQLQEVM 120 

MEFQGPMPAG MLIERSSDFG KTWRVYQYLA ADCTSTFPRV RQGRPQSWQD VRCQSLPQRP 180

NARLNGGKVQ LNLMDLVSGI PATQSQKIQE VGEITNLRVN FTRLAPVPQR GYHPPSAYYA 240

VSQLRLQGSC FCHGHADRCA PKPGASAGPS TAVQVHDVCV CQHNTAGPNC ERCAPFYNNR 300

PWRPAEGQDA HECQRCDCNG HSETCHFDPA VFAASQGAYG GVCDNCRDHT EGKNCERCQL 360

HYFRNRRPGA SIQETCISCE CDPDGAVPGA PCDPVTGQCV CKEHVQGERC DLCKPGFTGL 420 

TYANPQGCHR CDCNILGSRR DMPCDEESGR CLCLPNVVGP KCDQCAPYHW KLASGQGCEP 480

CACDPHNSPQ PTVQPVHRAV PCREGFGGLM CSAAAIRQCP DRTYGDVATG CRACDCDFRG 540

TEGPGCDKAS GRCLCRPGLT GPRCDQCQRG YCNRYPVCVA CHPCFQTYDA DLREQALRFG 600

RLRNATASLW SGPGLEDRGL ASRILDAKSK IEQIRAVLSS PAVTEQEVAQ VASAILSLRR 660 

TLQGLQLDLP LEEETLSLPR DLESLDRSFN GLLTMYQRKR EQFEKISSAD PSGAFRMLST 720

AYEQSAQAAQ QVSDSSRLLD QLRDSRREAE RLVRQAGGGG GTGSPKLVAL RLEMSSLPDL 780 

TPTFNKLCGN SRQMACTPIS CPGELCPQDN GTACGSRCRG VLPRAGGAFL MAGQVAEQLR 840 

GFNAQLQRTR QMIRAAEESA SQIQSSAQRL ETQVSASRSQ MEEDVRRTRL LIQQVRDFLT 900 

DPDTDAATIQ EVSEAVLALW LPTDSATVLQ KMNEIQAIAA RLPNVDLVLS QTKQDIARAR 960 

RLQAEAEEAR SRAHAVEGQV EDVVGNLRQG TVALQEAQDT MQGTSRSLRL IQDRVAEVQQ 1020

VLRPAEKLVT SMTKQLGDFW TRMEELRHQA RQQGAEAVQA QQLAEGASEQ ALSAQEGFER 1080 

IKQKYAELKD RLGQSSMLGE QGARIQSVKT EAEELFGETM EMMDRMKDME LELLRGSQAI 1140 
MLRSADLTGL EKRVEQIRDH INGRVLYYAT CK 

Seq ID NO: 21 DNA sequence 
Nucleic Acid Accession #: NM_003722 
Coding sequence: 145-1491 

1 11 21 31 41 51 

I I I I I I 

TCGTTGATAT CAAAGACAGT TGAAGGAAAT GAATTTTGAA ACTTCACGGT GTGCCACCCT 60 

ACAGTACTGC CCTGACCCTT ACATCCAGCG TTTCGTAGAA ACCCAGCTCA TTTCTCTTGG 120 

AAAGAAAGTT ATTACCGATC CACCATGTCC CAGAGCACAC AGACAAATGA ATTCCTCAGT 180 

CCAGAGGTTT TCCAGCATAT CTGGGATTTT CTGGAACAGC CTATATGTTC AGTTCAGCCC 240 

ATTGACTTGA ACTTTGTGGA TGAACCATCA GAAGATGGTG CGACAAACAA GATTGAGATT 300 

AGCATGGACT GTATCCGCAT GCAGGACTCG GACCTGAGTG ACCCCATGTG GCCACAGTAC 360 

ACGAACCTGG GGCTCCTGAA CAGCATGGAC CAGCAGATTC AGAACGGCTC CTCGTCCACC 420 

AGTCCCTATA ACACAGACCA CGCGCAGAAC AGCGTCACGG CGCCCTCGCC CTACGCACAG 480 

CCCAGCTCCA CCTTCGATGC TCTCTCTCCA TCACCCGCCA TCCCCTCCAA CACCGACTAC 540 

CCAGGCCCGC ACAGTTTCGA CGTGTCCTTC CAGCAGTCGA GCACCGCCAA GTCGGCCACC 600 

TGGACGTATT CCACTGAACT GAAGAAACTC TACTGCCAAA TTGCAAAGAC ATGCCCCATC 660 

CAGATCAAGG TGATGACCCC ACCTCCTCAG GGAGCTGTTA TCCGCGCCAT GCCTGTCTAC 720 

AAAAAAGCTG AGCACGTCAC GGAGGTGGTG AAGCGGTGCC CCAACCATGA GCTGAGCCGT 780 
GAATTCAACG AGGGACAGAT TGCCCCTCCT AGTCATTTGA TTCGAGTAGA GGGGAACAGC 840 
CATGCCCAGT ATGTAGAAGA TCCCATCACA GGAAGACAGA GTGTGCTGGT ACCTTATGAG 900 
CCACCCCAGG TTGGCACTGA ATTCACGACA GTCTTGTACA ATTTCATGTG TAACAGCAGT 960 

TGTGTTGGAG GGATGAACCG CCGTCCAATT TTAATCATTG TTACTCTGGA AACCAGAGAT 1020 

GGGCAAGTCC TGGGCCGACG CTGCTTTGAG GCCCGGATCT GTGCTTGCCC AGGAAGAGAC 1080 

AGGAAGGCGG ATGAAGATAG CATCAGAAAG CAGCAAGTTT CGGACAGTAC AAAGAACGGT 1140 

GATGGTACGA AGCGCCCGTT TCGTCAGAAC ACACATGGTA TCCAGATGAC ATCCATCAAG 1200 

AAACGAAGAT CCCCAGATGA TGAACTGTTA TACTTACCAG TGAGGGGCCG TGAGACTTAT 1260 

GAAATGCTGT TGAAGATCAA AGAGTCCCTG GAACTCATGC AGTACCTTCC TCAGCACACA 1320 

ATTGAAACGT ACAGGCAACA GCAACAGCAG CAGCACCAGC ACTTACTTCA GAAACATCTC 1380 

CTTTCAGCCT GCTTCAGGAA TGAGCTTGTG GAGCCCCGGA GAGAAACTCC AAAACAATCT 1440 

GACGTCTTCT TTAGACATTC CAAGCCCCCA AACCGATCAG TGTACCCATA GAGCCCTATC 1500 

TCTATATTTT AAGTGTGTGT GTTGTATTTC CATGTGTATA TGTGAGTGTG TGTGTGTGTA 1560 

TGTGTGTGCG TGTGTATCTA GCCCTCATAA ACAGGACTTG AAGACACTTT GGCTCAGAGA 1620 

CCCAACTGCT CAAAGGCACA AAGCCACTAG TGAGAGAATC TTTTGAAGGG ACTCAAACCT 1680 

TTACAAGAAA GGATGTTTTC TGCAGATTTT GTATCCTTAG ACCGGCCATT GGTGGGTGAG 1740 

GAACCACTGT GTTTGTCTGT GAGCTTTCTG TTGTTTCCTG GGAGGGAGGG GTCAGGTGGG 1800 

GAAAGGGGCA TTAAGATGTT TATTGGAACC CTTTTCTGTC TTCTTCTGTT GTTTTTCTAA 1860 

AATTCACAGG GAAGCTTTTG AGCAGGTCTC AAACTTAAGA TGTCTTTTTA AGAAAAGGAG 1920 

AAAAAAGTTG TTATTGTCTG TGCATAAGTA AGTTGTAGGT GACTGAGAGA CTCAGTCAGA 1980 

CCCTTTTAAT GCTGGTCATG TAATAATATT GCAAGTAGTA AGAAACGAAG GTGTCAAGTG 2040 

TACTGCTGGG CAGCGAGGTG ATCATTACCA AAAGTAATCA ACTTTGTGGG TGGAGAGTTC 2100 

TTTGTGAGAA CTTGCATTAT TTGTGTCCTC CCCTCATGTG TAGGTAGAAC ATTTCTTAAT 2160 

GCTGTGTACC TGCCTCTGCC ACTGTATGTT GGCATCTGTT ATGCTAAAGT TTTTCTTGTA 2220 

CATGAAACCC TGGAAGACCT ACTACAAAAA AACTGTTGTT TGGCCCCCAT AGCAGGTGAA 2280 

CTCATTTTGT GCTTTTAATA GAAAGACAAA TCCACCCCAG TAATATTGCC CTTACGTAGT 2340 

TGTTTACCAT TATTCAAAGC TCAAAATAGA ATTTGAAGCC CTCTCACAAA ATCTGTGATT 2400 

AATTTGCTTA ATTAGAGCTT CTATCCCTCA AGCCTACCTA CCATAAAACC AGCCATATTA 2460 

CTGATACTGT TCAGTGCATT TAGCCAGGAG ACTTACGTTT TGAGTAAGTG AGATCCAAGC 2520 

AGACGTGTTA AAATCAGCAC TCCTGGACTG GAAATTAAAG ATTGAAAGGG TAGACTACTT 2580 
TTCTTTTTTT TACTCAAAAG TTTAGAGAAT CTCTGTTTCT TTCCATTTTA AAAACATATT 2640
TTAAQATAAT A6CATAAAGA CTTTAAAAAT GTTCCTCCCC TCCATCTTCC CACACCCAGT 2700 
CACCAGCACT GTATTTTCTG TCACCAAGAC AATGATTTCT TGTTATTGAG GCTGTTGCTT 2760 
TTGTGGATGT GTGATTTTAA TTTTCAATAA ACTTTTGCAT CTTGGTTTAA AAGAAA 

Seq ID NO: 22 Protein sequence:
Protein Accession #: NP_003713 

1 11 21 31 

I I I I 

MSQSTQTNEF LSPEVFQHIW DFLEQPICSV QPIDLNFVDE 
DSDLSDPMWP QYTNLGLLNS MDQQIQNGSS STSPYNTDHA 
SPSPAIPSNT DYPGPHSFDV SFQQSSTAKS ATWTYSTELK
PQGAVIRAMP VYKKAEHVTE VVKRCPNHEL SREFNEGQIA
ITGRQSVLVP YEPPQVGTEF TTVLYNFMCN SSCVGGMNRR 
FEARICACPG RDRKADEDSI RKQQVSDSTK NGDGTKRPFR
LLYLPVRGRE TYEMLLKIKE SLELMQYLPQ HTIETYRQQQ 
LVEPRRETPK QSDVFFRHSK PPNRSVYP 



Seq ID NO: 23 DNA sequence
Nucleic Acid Accession #: NM_001944.1
Coding sequence: 84-3083 

1 11 21 31 41 51

I I I I I I 

TTTTCTTAGA CATTAACTGC AGACGGCTGG CAGGATAGAA GCAGCGGCTC ACTTGGACTT 60 

TTTCACCAGG GAAATCAGAG ACAATGATGG GGCTCTTCCC CAGAACTACA GGGGCTCTGG 120 

CCATCTTCGT GGTGGTCATA TTGGTTCATG GAGAATTGCG AATAGAGACT AAAGGTCAAT 180 

ATGATGAAGA AGAGATGACT ATGCAACAAG CTAAAAGAAG GCAAAAACGT GAATGGGTGA 240 

AATTTGCCAA ACCCTGCAGA GAAGGAGAAG ATAACTCAAA AAGAAACCCA ATTGCCAAGA 300 

TTACTTCAGA TTACCAAGCA ACCCAGAAAA TCACCTACCG AATCTCTGGA GTGGGAATCG 360 

ATCAGCCGCC TTTTGGAATC TTTGTTGTTG ACAAAAACAC TGGAGATATT AACATAACAG 420 

CTATAGTCGA CCGGGAGGAA ACTCCAAGCT TCCTGATCAC ATGTCGGGCT CTAAATGCCC 480 

AAGGACTAGA TGTAGAGAAA CCACTTATAC TAACGGTTAA AATTTTGGAT ATTAATGATA S40 

ATCCTCCAGT ATTTTCACAA CAAATTTTCA TGGGTGAAAT TGAAGAAAAT AGTGCCTCAA 600 

ACTCACTGGT GATGATACTA AATGCCACAG ATGCAGATGA ACCAAACCAC TTGAATTCTA 660 

AAATTGCCTT CAAAATTGTC TCTCAGGAAC CAGCAGGCAC ACCCATGTTC CTCCTAAGCA 720 

GAAACACTGG GGAAGTCCGT ACTTTGACCA ATTCTCTTGA CCGAGAGCAA GCTAGCAGCT 780 

ATCGTCTGGT TGTGAGTGGT GCAGACAAAG ATGGAGAAGG ACTATCAACT CAATGTGAAT 840 

GTAATATTAA AGTGAAAGAT GTCAACGATA ACTTCCCAAT GTTTAGAGAC TCTCAGTATT 900 

CAGCACGTAT TGAAGAAAAT ATTTTAAGTT CTGAATTACT TCX3ATTTCAA GTAACAGATT 960 

TGGATGAAGA GTACACAGAT AATTGGCTTG CAGTATATTT CTTTACCTCT GGGAATGAAG 1020 

GAAATTGGTT TGAAATACAA ACTGATCCTA GAACTAATGA AGGCATCCTG AAAGTGGTGA 1080 

AGGCTCTAGA TTATGAACAA CTACAAAGCG TGAAACTTAG TATTGCTGTC AAAAACAAAG 1140 

CTGAATTTCA CCAATCAGTT ATCTCTCGAT ACCGAGTTCA GTCAACCCCA GTCACAATTC 1200 

AGGTAATAAA TGTAAGAGAA GGAATTGCAT TCCGTCCTGC TTCCAAGACA TTTACTGTGC 1260 

AAAAAGGCAT AAGTAGCAAA AAATTGGTGG ATTATATCCT GGGAACATAT CAAGCCATCG 1320 

ATGAGGACAC TAACAAAGCT GCCTCAAATG TCAAATATGT CATGGGACGT AACGATGGTG 1380 

GATACCTAAT GATTGATTCA AAAACTGCTG AAATCAAATT TGTCAAAAAT ATGAACCGAG 1440 

ATTCTACTTT CATAGTTAAC AAAACAATCA CAGCTGAGGT TCTGGCCATA GATGAATACA 1500 

CGGGTAAAAC TTCTACAGGC ACGGTATATG TTAGAGTACC CGATTTCAAT GACAATTGTC 1560 

CAACAGCTGT CCTCGAAAAA GATGCAGTTT GCAGTTCTTC ACCTTCCGTG GTTGTCTCCG 1620 

CTAGAACACT GAATAATAGA TACACTGGCC CCTATACATT TGCACTGGAA GATCAACCTG 1680 

TAAAGTTGCC TGCCGTATGG AGTATCACAA CCCTCAATGC TACCTCGGCC CTCCTCAGAG 1740 

CCCAGGAACA GATACCTCCT GGAGTATACC ACATCTCCCT GGTACTTACA GACAGTCAGA 1800 

ACAATCGGTG TGAGATGCCA CGCAGCTTGA CACTGGAAGT CTGTCAGTGT GACAACAGGG 1860 

GCATCTGTGG AACTTCTTAC CCAACCACAA GCCCTGGGAC CAGGTATGGC AGGCCGCACT 1920 

CAGGGAGGCT GGGGCCTGCC GCCATCGGCC TGCTGCTCCT TGGTCTCCTG CTGCTGCTGT 1980 

TGGCCCCCCT TCTGCTGTTG ACCTGTGACT GTGGGGCAGG TTCTACTGGG GGAGTGACAG 2040 

GTGGTTTTAT CCCAGTTCCT GATGGCTCAG AAGGAACAAT TCATCAGTGG GGAATTGAAG 2100 

GAGCCCATCC TGAAGACAAG GAAATCACAA ATATTTGTGT GCCTCCTGTA ACAGCCAATG 2160 

GAGCCGATTT CATGGAAAGT TCTGAAGTTT GTACAAATAC GTATGCCAGA GGCACAGCGG 2220 

TGGAAGGCAC TTCAGGAATG GAAATGACCA CTAAGCTTGG AGCAGCCACT GAATCTGGAG 2280 

GTGCTGCAGG CTTTGCAACA GGGACAGTGT CAGGAGCTGC TTCAGGATTC GGAGCAGCCA 2340 

CTGGAGTTGG CATCTGTTCC TCAGGGCAGT CTGGAACCAT GAGAACAAGG CATTCCACTG 2400 

GAGGAACCAA TAAGGACTAC GCTGATGGGG CGATAAGCAT GAATTTTCTG GACTCCTACT 2460 

TTTCTCAGAA AGCATTTGCC TGTGCGGAGG AAGACGATGG CCAGGAAGCA AATGACTGCT 2520 

TGTTGATCTA TGATAATGAA GGCGCAGATG CCACTGGTTC TCCTGTGGGC TCCGTGGGTT 2580 

GTTGCAGTTT TATTGCTGAT GACCTGGATG ACAGCTTCTT GGACTCACTT GGACCCAAAT 2640 

TTAAAAAACT TGCAGAGATA AGCCTTGGTG TTGATGGTGA AGGCAAAGAA GTTCAGCCAC 2700 

CCTCTAAAGA CAGCGGTTAT GGGATTGAAT CCTGTGGCCA TCCCATAGAA GTCCAGCAGA 2760 

CAGGATTTGT TAAGTGCCAG ACTTTGTCAG GAAGTCAAGG AGCTTCTGCT TTGTCCGCCT 2820 

CTGGGTCTGT CCAGCCAGCT GTTTCCATCC CTGACCCTCT GCAGCATGGT AACTATTTAG 2880 

TAACGGAGAC TTACTCGGCT TCTGGTTCCC TCGTGCAACC TTCCACTGCA GGCTTTGATC 2940 

CACTTCTCAC ACAAAATGTG ATAGTGACAG AAAGGGTGAT CTGTCCCATT TCCAGTGTTC 3000 

CTGGCAACCT AGCTGGCCCA ACGCAGCTAC GAGGGTCACA TACTATGCTC TGTACAGAGG 3060 

ATCCTTGCTC CCGTCTAATA TGACCAGAAT GAGCTGGAAT ACCACACTGA CCAAATCTGG 3120 

ATCTTTGGAC TAAAGTATTC AAAATAGCAT AGCAAAGCTC ACTGTATTGG GCTAATAATT 3180 

TGGCACTTAT TAGCTTCTCT CATAAACTGA TCACGATTAT AAATTAAATG TTTGGGTTCA 3240 

TACCCCAAAA GCAATATGTT GTCACTCCTA ATTCTCAAGT ACTATTCAAA TTGTAGTAAA 3300 
TCTTAAAGTT TTTCAAAACC CTAAAATCAT ATTCGC 

Seq ID NO j 24 Protein sequence t 
Protein Accession #: NPJ>01935.1 

1 11 21 31 41 51 



41 51 

I i 

PSEDGATNKI EISMDCIRMQ 60 

QNSVTAPSPY AQPSSTFDAL 120 

KLYCQIAKTC PIQIKVMTPP 180 

PPSHLIRVEG NSHAQYVEDP 240 

PILIIVTLET RDGQVLGRRC 300 

QNTHGIQMTS IKKRRSPDDE 360 

QQQHQHLLQK HLLSACFRNE 420 
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i r i i i i 

MMGLFPRTTG ALAIFWVIL VHGELRIETK GQYDEBEMTM QQAKRRQKRS WVKPAKPCRE 60 

GEDNSKRNPI AKITSDYQAT QKITYRISGV GIDQPPFGIF WDKNTGDIN ITAIVDREET 120 

PSFLITCRAL NAQGIoDVEKP LILTVKILDI NDNPPVFSQQ IFMGEIEENS ASNSLVMILN 180 

ATDADEPNHL NSKIAFKXVS QEPAGTPMFL LSRNTGEVRT LTNSLDREQA SSYRLWSGA 240 

DKDGEGLSTQ CECNIKVKDV NDNPPMFRDS QYSARIEENI LSSELLRFQV TDLDEEYTDN 300 

WLAVYFFTSG NEGNWFEIQT DPRTNEGILK WKALDYEQL QSVKLSIAVK NKAEFHQSVI 360 

SRYRVQSTPV TIQVINVREG IAFRPASKTF TVQKGISSKK LVDYILGTYQ AIDEDTNKAA 420 

SNVKYVMGRN DGGYLMIDSK TAEIKFVKNM NRDSTFIVNK TITAEVLAID EYTGKTSTGT 480 

VYVRVPDFND NCPTAVLEKD AVCSSSPSW VSARTLNNRY TGPYT PALED QPVKLPAVWS 540 

ITTLNATSAIi LRAQEQIPPG VYHISLVLTD SQNNRCEMPR SLTLEVCQCD NRGIOGTSYP 600 

TTSPGTRYGR PHSGRLGPAA IGLLLLGLLL LLLAPLLLLT CDOGAGSTGG VTGGFIPVPD 660 

GSEGTIHQWG IEGAHPEDKE ITNICVPPVT ANGADFMESS EVCTNTYARG TAVEGTSGME 720 

MTTKLGAATE SGGAAGFATG TVSGAASGFG AATGVGICSS GQSGTMRTRH STGGTNKDYA 780 

DGAISMNFLD SYPSQKAPAC AEEDDGQEAN DCLLIYDNEG ADATGSPVGS VGCCSFIADD 840 

LDDSFLDSLG PKFKKLAEIS LGVDGEGKEV QPPSKDSGYG IESCGHPIEV QQTGFVKCQT 900 

LSGSQGASAL SASGSVQPAV SIPDPLQHGN YLVTETYSAS GSLVQPSTAG FDPLLTQNVI 960 
VTERVICPIS SVPGNLAGPT QLRGSHTMLC TEDPCSRLI 

Seq ID NO: 25 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 56-1642 

1 11 21 31 41 51 

I I I I I I 

AGTATCCCAG GAGGAGCAAG TGGCACGTCT TCGGACCTAG GCTGCCCCTG CCGTCATGTC 60 

GCAAGGGATC CTTTCTCCGC CAGCGGGCTT GCTGTCCGAT GACGATGTCG TAGTTTCTCC 120 

CATGTTTGAG TCCACAGCTG CAGATTTGGG GTCTGTGGTA CGCAAGAACC TGCTATCAGA 180 

CTGCTCTGTC GTCTCTACCT CCCTAGAGGA CAAGCAGCAG GTTCCATCTG AGGACAGTAT 240 

GGAGAAGGTG AAAGTATACT TGAGGGTTAG GCCCTTGTTA CCTTCAGAGT TGGAACGACA 300 

GGAAGATCAG GGTTGTGTCC GTATTGAGAA TGTGGAGACC CTTGTTCTAC AAGCACCCAA 360 

GGACTCTTTT GCCCTGAAGA GCAATGAACG GGGAATTGGC CAAGCCACAC ACAGGTTCAC 420 

CTTTTCCCAG ATCTTTGGGC CAGAAGTGGG ACAGGCATCC TTCTTCAACC TAACTGTGAA 480 

GGAGATGGTA AAGGATGTAC TCAAAGGGCA GAACTGGCTC ATCTATACAT ATGGAGTCAC 540 

TAACTCAGGG AAAACCCACA CGATTCAAGG TACCATCAAG GATGGAGGGA TTCTCCCCCG 600 

GTCCCTGGCG CTGATCTTCA ATAGCCTCCA AGGCCAACTT CATCCAACAC CTGATCTGAA 660 

GCCCTTGCTC TCCAATGAGG TAATCTGGCT AGACAGCAAG CAGATCCGAC AGGAGGAAAT 720 

GAAGAAGCTG TCCCTGCTAA ATGGAGGCCT CCAAGAGGAG GAGCTGTCCA CTTCCTTGAA 780 

GAGGAGTGTC TACATCGAAA GTCGGATAGG TACCAGCACC AGCTTCGACA GTGGCATTGC 840 

TGGGCTCTCT TCTATCAGTC AGTGTACCAG CAGTAGCCAG CTGGATGAAA CAAGTCATCG 900 

ATGGGCACAG CCAGACACTG CCCCACTACC TGTCCCGGCA AACATTCGCT TCTCCATCTG 960 

GATCTCATTC TTTGAGATCT ACAACGAACT GCTTTATGAC CTATTAGAAC CGCCTAGCCA 1020 

ACAGCGCAAG AGGCAGACTT TGCGGCTATG CGAGGATCAA AATGGCAATC CCTATGTGAA 1080 

AGATCTCAAC TGGATTCATG TGCAAGATGC TGAGGAGGCC TGGAAGCTCC TAAAAGTGGG 1140 

TCGTAAGAAC CAGAGCTTTG CCAGCACCCA CCTCAACCAG AACTCCAGCC GCAGTCACAG 1200 

CATCTTCTCA ATCAGGATCC TACACCTTCA GGGGGAAGGA GATATAGTCC CCAAGATCAG 1260 

CGAGCTGTCA CTCTGTGATC TGGCTGGCTC AGAGCGCTGC AAAGATCAGA AGAGTGGTGA 1320 

ACGGTTGAAG GAAGCAGGAA ACATTAACAC CTCTCTACAC ACCCTGGGCC GCTGTATTGC 1380 

TGCCCTTCGT CAAAACCAGC AGAACCGGTC AAAGCAGAAC CTGGTTCCCT TCCGTGACAG 1440 

CAAGTTGACT CGAGTGTTCC AAGGTTTCTT CACAGGCCGA GGCCGTTCCT GCATGATTGT 1500 

CAATGTGAAT CCCTGTGCAT CTACCTATGA TGAAACTCTT CATGTGGCCA AGTTCTCAGC 1560 

CATTGCTAGC CAGGTGACTT GTGCATGCCC CACCTATGCA ACTGGGATTC CCATCCCTGC 1620 

ACTCGTTCAT CAAGGAACAT AGTCTTCAGG TATCCCCCAG CTTAGAGAAA GGGGCTAAGG 1680 

CAGACACAGG CCTTGATGAT GATATTGAAA ATGAAGCTGA CATCTCCATG TATGGCAAAG 1740 

AGGAGCTCCT ACAAGTTGTG GAAGCCATGA AGACACTGCT TTTGAAGGAA CGACAGGAAA 1800 

AGCTACAGCT GGAGATGCAT CTCCGAGATG AAATTTGCAA TGAGATGGTA GAACAGATGC 1860 

AACAGCGGGA ACAGTGGTGC AGTGAACATT TGGACACCCA AAAGGAACTA TTGGAGGAAA 1920 

TGTATGAAGA AAAACTAAAT ATCCTCAAGG AGTCACTGAC AAGTTTTTAC CAAGAAGAGA 1980 

TTCAGGAGCG GGATGAAAAG ATTGAAGAGC TAGAAGCTCT CTTGCAGGAA GCCAGACAAC 2040 

AGTCAGTGGC CCATCAGCAA TCAGGGTCTG AATTGGCCCT ACGGCGGTCA CAAAGGTTGG 2100 

CAGCTTCTGC CTCCACCCAG CAGCTTCAGG AGGTTAAAGC TAAATTACAG CAGTGCAAAG 2160 

CAGAGCTAAA CTCTACCACT GAAGAGTTGC ATAAGTATCA GAAAATGTTA GAACCACCAC 2220 

CCTCAGCCAA GCCCTTCACC ATTGATGTGG ACAAGAAGTT AGAAGAGGGC CAGAAGAATA 2280 

TAAGGCTGTT GCGGACAGAG CTTCAGAAAC TTGGTGAGTC TCTCCAATCA GCAGAGAGAG 2340 

CTTGTTGCCA CAGCACTGGG GCAGGAAAAC TTCGTCAAGC CTTGACCACT TGTGATGACA 2400 

TCTTAATCAA ACAGGACCAG ACTCTGGCTG AACTGCAGAA CAACATGGTG CTAGTGAAAC 2460 

TGGACCTTCG GAAGAAGGCA GCATGTATTG CTGAGCAGTA TCATACTGTG TTGAAACTCC 2520 

AAGGCCAGGT TTCTGCCAAA AAGCGCCTTG GTACCAACCA GGAAAATCAG CAACCAAACC 2580 

AACAACCACC AGGGAAGAAA CCATTCCTTC GAAATTTACT TCCCCGAACA CCAACCTGCC 2640 

AAAGCTCAAC AGACTGCAGC CCTTATGCCC GGATCCTACG CTCACGGCGT TCCCCTTTAC 2700 

TCAAATCTGG GCCTTTTGGC AAAAAGTACT AAGGCTGTGG GGAAAGAGAA GAGCAGTCAT 2760 

GGCCCTGAGG TGGGTCAGCT ACTCTCCTGA AGAAATAGGT CTCTTTTATG CTTTACCATA 2820 

TATCAGGAAT TATATCCAGG ATGCAATACT CAGACACTAG CTTTTTTCTC ACTTTTGTAT 2880 

TATAACCACC TATGTAATCT CATGTTGTTG TTTTTTTTTA TTTACTTATA TGATTTCTAT 2940 

GCACACAAAA ACAGTTATAT TAAAGATATT ATTGTTCACA TTTTTTATTG AATTCCAAAT 3000 
GTAGCAAAAT CATTAAAACA AATTATAAAA GGGACAGAAA AA 

Seq ID NO: 26 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51

I I I I I I

MSQGILSPPA GLLSDDDVVV SPMFESTAAD LGSVVRKNLL SDCSVVSTSL EDKQQVPSED 60

SMEKVKVYLR VRPLLPSELE RQEDQGCVRI ENVETLVLQA PKDSFALKSN ERGIGQATHR 120

FTFSQIFGPE VGQASFFNLT VKEMVKDVLK GQNWLIYTYG VTNSGKTHTI QGTIKDGGIL 180

PRSLALIFNS LQGQLHPTPD LKPLLSNEVI WLDSKQIRQE EMKKLSLLNG GLQEEELSTS 240

LKRSVYIESR IGTSTSFDSG IAGLSSISQC TSSSQLDETS HRWAQPDTAP LPVPANIRFS 300

IWISFFEIYN ELLYDLLEPP SQQRKRQTLR LCEDQNGNPY VKDLNWIHVQ DAEEAWKLLK 360

VGRKNQSFAS THLNQNSSRS HSIFSIRILH LQGEGDIVPK ISELSLCDLA GSERCKDQKS 420

GERLKEAGNI NTSLHTLGRC IAALRQNQQN RSKQNLVPFR DSKLTRVFQG FFTGRGRSCM 480
IVNVNPCAST YDETLHVAKF SAIASQVTCA CPTYATGIPI PALVHQGT

Seq ID NO: 27 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 13-1424

1 11 21 31 41 51

I I I I I I

TAGAAGTTTA CAATGAAGTT TCTTCTAATA CTGCTCCTGC AGGCCACTGC TTCTGGAGCT 60

CTTCCCCTGA ACAGCTCTAC AAGCCTGGAA AAAAATAATG TGCTATTTGG TGAAAGATAC 120

TTAGAAAAAT TTTATGGCCT TGAGATAAAC AAACTTCCAG TGACAAAAAT GAAATATAGT 180

GGAAACTTAA TGAAGGAAAA AATCCAAGAA ATGCAGCACT TCTTGGGTCT GAAAGTGACC 240

GGGCAACTGG ACACATCTAC CCTGGAGATG ATGCACGCAC CTCGATGTGG AGTCCCCGAT 300

GTCCATCATT TCAGGGAAAT GCCAGGGGGG CCCGTATGGA GGAAACATTA TATCACCTAC 360

AGAATCAATA ATTACACACC TGACATGAAC CGTGAGGATG TTGACTACGC AATCCGGAAA 420

GCTTTCCAAG TATGGAGTAA TGTTACCCCC TTGAAATTCA GCAAGATTAA CACAGGCATG 480

GCTGACATTT TGGTGGTTTT TGCCCGTGGA GCTCATGGAG ACTTCCATGC TTTTGATGGC 540

AAAGGTGGAA TCCTAGCCCA TGCTTTTGGA CCTGGATCTG GCATTGGAGG GGATGCACAT 600

TTCGATGAGG ACGAATTCTG GACTACACAT TCAGGAGGCA CAAACTTGTT CCTCACTGCT 660

GTTCACGAGA TTGGCCATTC CTTAGGTCTT GGCCATTCTA GTGATCCAAA GGCCGTAATG 720

TTCCCCACCT ACAAATATGT TGACATCAAC ACATTTCGCC TCTCTGCTGA TGACATACGT 780

GGCATTCAGT CCCTGTATGG AGACCCAAAA GAGAACCAAC GCTTGCCAAA TCCTGACAAT 840

TCAGAACCAG CTCTCTGTGA CCCCAATTTG AGTTTTGATG CTGTCACTAC CGTGGGAAAT 900

AAGATCTTTT TCTTCAAAGA CAGGTTCTTC TGGCTGAAGG TTTCTGAGAG ACCAAAGACC 960

AGTGTTAATT TAATTTCTTC CTTATGGCCA ACCTTGCCAT CTGGCATTGA AGCTGCTTAT 1020

GAAATTGAAG CCAGAAATCA AGTTTTTCTT TTTAAAGATG ACAAATACTG GTTAATTAGC 1080

AATTTAAGAC CAGAGCCAAA TTATCCCAAG AGCATACATT CTTTTGGTTT TCCTAACTTT 1140

GTGAAAAAAA TTGATGCAGC TGTTTTTAAC CCACGTTTTT ATAGGACCTA CTTCTTTGTA 1200

GATAACCAGT ATTGGAGGTA TGATGAAAGG AGACAGATGA TGGACCCTGG TTATCCCAAA 1260

CTGATTACCA AGAACTTCCA AGGAATCGGG CCTAAAATTG ATGCAGTCTT CTACTCTAAA 1320

AACAAATACT ACTATTTCTT CCAAGGATCT AACCAATTTG AATATGACTT CCTACTCCAA 1380

CGTATCACCA AAACACTGAA AAGCAATAGC TGGTTTGGTT GTTGAAAATG GTGTAATTAA 1440

TGGTTTTTGT TAGTTCACTT CAGCTTAATA AGTATTTATT GCATATTTGC TATGTCCTCA 1500

GTGTACCACT ACTTAGAGAT ATGTATCATA AAAATAAAAT CTGTAAACCA TAGGTAATGA 1560

TTATATAAAA TACATAATAT TTTTCAATTT TGAAAACTCT AATTGTCCAT TCTTGCTTGA 1620

CTCTACTATT AAGTTTGAAA ATAGTTACCT TCAAAGCAAG ATAATTCTAT TTGAAGCATG 1680

CTCTGTAAGT TGCTTCCTAA CATCCTTGGA CTGAGAAATT ATACTTACTT CTGGCATAAC 1740
TAAAATTAAG TATATATATT TTGGCTCAAA TAAAATTG



Seq ID NO: 28 Protein sequence: 
Protein Accession #: Eos sequence 



1 11 21 31 41 51

I I I I I I

MKFLLILLLQ ATASGALPLN SSTSLEKNNV LFGERYLEKF YGLEINKLPV TKMKYSGNLM 60

KEKIQEMQHF LGLKVTGQLD TSTLEMMHAP RCGVPDVHHF REMPGGPVWR KHYITYRINN 120

YTPDMNREDV DYAIRKAFQV WSNVTPLKFS KINTGMADIL VVFARGAHGD FHAFDGKGGI 180

LAHAFGPGSG IGGDAHFDED EFWTTHSGGT NLFLTAVHEI GHSLGLGHSS DPKAVMFPTY 240

KYVDINTFRL SADDIRGIQS LYGDPKENQR LPNPDNSEPA LCDPNLSFDA VTTVGNKIFF 300

FKDRFFWLKV SERPKTSVNL ISSLWPTLPS GIEAAYEIEA RNQVFLFKDD KYWLISNLRP 360

EPNYPKSIHS FGFPNFVKKI DAAVFNPRFY RTYFFVDNQY WRYDERRQMM DPGYPKLITK 420
NFQGIGPKID AVFYSKNKYY YFFQGSNQFE YDFLLQRITK TLKSNSWFGC



Seq ID NO: 29 DNA sequence

Nucleic Acid Accession #: NM_006115.1

Coding sequence: 236-1765

1 11 21 31 41 51

I I I I I I

GCTTCAGGGT ACAGCTCCCC CGCAGCCAGA AGCCGGGCCT GCAGCGCCTC AGCACCGCTC 60

CGGGACACCC CACCCGCTTC CCAGGCGTGA CCTGTCAACA GCAACTTCGC GGTGTGGTGA 120

ACTCTCTGAG GAAAAACCAT TTTGATTATT ACTCTCAGAC GTGCGTGGCA ACAAGTGACT 180

GAGACCTAGA AATCCAAGCG TTGGAGGTCC TGAGGCCAGC CTAAGTCGCT TCAAAATGGA 240

ACGAAGGCGT TTGTGGGGTT CCATTCAGAG CCGATACATC AGCATGAGTG TGTGGACAAG 300

CCCACGGAGA CTTGTGGAGC TGGCAGGGCA GAGCCTGCTG AAGGATGAGG CCCTGGCCAT 360

TGCCGCCCTG GAGTTGCTGC CCAGGGAGCT CTTCCCGCCA CTCTTCATGG CAGCCTTTGA 420

CGGGAGACAC AGCCAGACCC TGAAGGCAAT GGTGCAGGCC TGGCCCTTCA CCTGCCTCCC 480

TCTGGGAGTG CTGATGAAGG GACAACATCT TCACCTGGAG ACCTTCAAAG CTGTGCTTGA 540

TGGACTTGAT GTGCTCCTTG CCCAGGAGGT TCGCCCCAGG AGGTGGAAAC TTCAAGTGCT 600

GGATTTACGG AAGAACTCTC ATCAGGACTT CTGGACTGTA TGGTCTGGAA ACAGGGCCAG 660

TCTGTACTCA TTTCCAGAGC CAGAAGCAGC TCAGCCCATG ACAAAGAAGC GAAAAGTAGA 720

TGGTTTGAGC ACAGAGGCAG AGCAGCCCTT CATTCCAGTA GAGGTGCTCG TAGACCTGTT 780

CCTCAAGGAA GGTGCCTGTG ATGAATTGTT CTCCTACCTC ATTGAGAAAG TGAAGCGAAA 840

GAAAAATGTA CTACGCCTGT GCTGTAAGAA GCTGAAGATT TTTGCAATGC CCATGCAGGA 900

TATCAAGATG ATCCTGAAAA TGGTGCAGCT GGACTCTATT GAAGATTTGG AAGTGACTTG 960

TACCTGGAAG CTACCCACCT TGGCGAAATT TTCTCCTTAC CTGGGCCAGA TGATTAATCT 1020

GCGTAGACTC CTCCTCTCCC ACATCCATGC ATCTTCCTAC ATTTCCCCGG AGAAGGAAGA 1080

GCAGTATATC GCCCAGTTCA CCTCTCAGTT CCTCAGTCTG CAGTGCCTGC AGGCTCTCTA 1140

TGTGGACTCT TTATTTTTCC TTAGAGGCCG CCTGGATCAG TTGCTCAGGC ACGTGATGAA 1200

CCCCTTGGAA ACCCTCTCAA TAACTAACTG CCGGCTTTCG GAAGGGGATG TGATGCATCT 1260

GTCCCAGAGT CCCAGCGTCA GTCAGCTAAG TGTCCTGAGT CTAAGTGGGG TCATGCTGAC 1320

CGATGTAAGT CCCGAGCCCC TCCAAGCTCT GCTGGAGAGA GCCTCTGCCA CCCTCCAGGA 1380

CCTGGTCTTT GATGAGTGTG GGATCACGGA TGATCAGCTC CTTGCCCTCC TGCCTTCCCT 1440

GAGCCACTGC TCCCAGCTTA CAACCTTAAG CTTCTACGGG AATTCCATCT CCATATCTGC 1500

CTTGCAGAGT CTCCTGCAGC ACCTCATCGG GCTGAGCAAT CTGACCCACG TGCTGTATCC 1560 

TGTCCCCCTG GAGAGTTATG AGGACATCCA TGGTACCCTC CACCTGGAGA GGCTTGCCTA 1620 

TCTGCATGCC AGGCTCAGGG AGTTGCTGTG TGAGTTGGGG CGGCCCAGCA TGGTCTGGCT 1680 

TAGTGCCAAC CCCTGTCCTC ACTGTGGGGA CAGAACCTTC TATGACCCGG AGCCCATCCT 1740 

GTGCCCCTGT TTCATGCCTA ACTAGCTGGG TGCACATATC AAATGCTTCA TTCTGCATAC 1800 

TTGGACACTA AAGCCAGGAT GTGCATGCAT CTTGAAGCAA CAAAGCAGCC ACAGTTTCAG 1860 

ACAAATGTTC AGTGTGAGTG AGGAAAACAT GTTCAGTGAG GAAAAAACAT TCAGACAAAT 1920 

GTTCAGTGAG GAAAAAAAGG GGAAGTTGGG GATAGGCAGA TGTTGACTTG AGGAGTTAAT 1980 

GTGATCTTTG GGGAGATACA TCTTATAGAG TTAGAAATAG AATCTGAATT TCTAAAGGGA 2040 

GATTCTGGCT TGGGAAGTAC ATGTAGGAGT TAATCCCTGT GTAGACTGTT GTAAAGAAAC 2100 
TGTTGAAAAT AAAGAGAAGC AATGTGAAGC AAAAAAAAAA AAAAAAAA 

Seq ID NO: 30 Protein sequence: 
Protein Accession #: NP_006106.1 


Seq ID NO: 31 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 64-2754 

1 11 21 31 41 51 

GGCAGGTCTC GCTCTCGGCA CCCTCCCGGC GCCCGCGTTC TCCTGGCCCT GCCCGGCATC 60 

CCGATGGCCG CCGCTGGGCC CCGGCGCTCC GTGCGCGGAG CCGTCTGCCT GCATCTGCTG 120 

CTGACCCTCG TGATCTTCAG TCGTGATGGT GAAGCCTGCA AAAAGGTGAT ACTTAATGTA 180 

CCTTCTAAAC TAGAGGCAGA CAAAATAATT GGCAGAGTTA ATTTGGAAGA GTGCTTCAGG 240 

TCTGCAGACC TCATCCGGTC AAGTGATCCT GATTTCAGAG TTCTAAATGA TGGGTCAGTG 300 

TACACAGCCA GGGCTGTTGC GCTGTCTGAT AAGAAAAGAT CATTTACCAT ATGGCTTTCT 360 

GACAAAAGGA AACAGACACA GAAAGAGGTT ACTGTGCTGC TAGAACATCA GAAGAAGGTA 420 

TCGAAGACAA GACACACTAG AGAAACTGTT CTCAGGCGTG CCAAGAGGAG ATGGGCACCT 480 

ATTCCTTGCT CTATGCAAGA GAATTCCTTG GGCCCTTTCC CATTGTTTCT TCAACAAGTT 540 

GAATCTGATG CAGCACAGAA CTATACTGTC TTCTACTCAA TAAGTGGACG TGGAGTTGAT 600 

AAAGAACCTT TAAATTTGTT TTATATAGAA AGAGACACTG GAAATCTATT TTGCACTCGG 660 

CCTGTGGATC GTGAAGAATA TGATGTTTTT GATTTGATTG CTTATGCGTC AACTGCAGAT 720 

GGATATTCAG CAGATCTGCC CCTCCCACTA CCCATCAGGG TAGAGGATGA AAATGACAAC 780 

CACCCTGTTT TCACAGAAGC AATTTATAAT TTTGAAGTTT TGGAAAGTAG TAGACCTGGT 840 

ACTACAGTGG GGGTGGTTTG TGCCACAGAC AGAGATGAAC CGGACACAAT GCATACGCGC 900 

CTGAAATACA GCATTTTGCA GCAGACACCA AGGTCACCTG GGCTCTTTTC TGTGCATCCC 960 

AGCACAGGCG TAATCACCAC AGTCTCTCAT TATTTGGACA GAGAGGTTGT AGACAAGTAC 1020 

TCATTGATAA TGAAAGTACA AGACATGGAT GGCCAGTTTT TTGGATTGAT AGGCACATCA 1080 

ACTTGTATCA TAACAGTAAC AGATTCAAAT GATAATGCAC CCACTTTCAG ACAAAATGCT 1140 

TATGAAGCAT TTGTAGAGGA AAATGCATTC AATGTGGAAA TCTTACGAAT ACCTATAGAA 1200 

GATAAGGATT TAATTAACAC TGCCAATTGG AGAGTCAATT TTACCATTTT AAAGGGAAAT 1260 

GAAAATGGAC ATTTCAAAAT CAGCACAGAC AAAGAAACTA ATGAAGGTGT TCTTTCTGTT 1320 

GTAAAGCCAC TGAATTATGA AGAAAACCGT CAAGTGAACC TGGAAATTGG AGTAAACAAT 1380 

GAAGCGCCAT TTGCTAGAGA TATTCCCAGA GTGACAGCCT TGAACAGAGC CTTGGTTACA 1440 

GTTCATGTGA GGGATCTGGA TGAGGGGCCT GAATGCACTC CTGCAGCCCA ATATGTGCGG 1500 

ATTAAAGAAA ACTTAGCAGT GGGGTCAAAG ATCAACGGCT ATAAGGCATA TGACCCCGAA 1560 
AATAGAAATG GCAATGGTTT AAGGTACAAA AAATTGCATG ATCCTAAAGG TTGGATCACC 1620 

ATTGATGAAA TTTCAGGGTC AATCATAACT TCCAAAATCC TGGATAGGGA GGTTGAAACT 1680 

CCCAAAAATG AGTTGTATAA TATTACAGTC CTGGCAATAG ACAAAGATGA TAGATCATGT 1740 

ACTGGAACAC TTGCTGTGAA CATTGAAGAT GTAAATGATA ATCCACCAGA AATACTTCAA 1800 

GAATATGTAG TCATTTGCAA ACCAAAAATG GGGTATACCG ACATTTTAGC TGTTGATCCT 1860 

GATGAACCTG TCCATGGAGC TCCATTTTAT TTCAGTTTGC CCAATACTTC TCCAGAAATC 1920 

AGTAGACTGT GGAGCCTCAC CAAAGTTAAT GATACAGCTG CCCGTCTTTC ATATCAGAAA 1980 

AATGCTGGAT TTCAAGAATA TACCATTCCT ATTACTGTAA AAGACAGGGC CGGCCAAGCT 2040 

GCAACAAAAT TATTGAGAGT TAATCTGTGT GAATGTACTC ATCCAACTCA GTGTCGTGCG 2100 

ACTTCAAGGA GTACAGGAGT AATACTTGGA AAATGGGCAA TCCTTGCAAT ATTACTGGGT 2160 

ATAGCACTGC TCTTTTCTGT ATTGCTAACT TTAGTATGTG GAGTTTTTGG TGCAACTAAA 2220 

GGGAAACGTT TTCCTGAAGA TTTAGCACAG CAAAACTTAA TTATATCAAA CACAGAAGCA 2280 

CCTGGAGACG ATAGAGTGTG CTCTGCCAAT GGATTTATGA CCCAAACTAC CAACAACTCT 2340 

AGCCAAGGTT TTTGTGGTAC TATGGGATCA GGAATGAAAA ATGGAGGGCA GGAAACCATT 2400 

GAAATGATGA AAGGAGGAAA CCAGACCTTG GAATCCTGCC GGGGGGCTGG GCATCATCAT 2460 

ACCCTGGACT CCTGCAGGGG AGGACACACG GAGGTGGACA ACTGCAGATA CACTTACTCG 2520 

GAGTGGCACA GTTTTACTCA ACCCCGTCTC GGTGAAAAAT TGCATCGATG TAATCAGAAT 2580 

GAAGACCGCA TGCCATCCCA AGATTATGTC CTCACTTATA ACTATGAGGG AAGAGGATCT 2640 

CCAGCTGGTT CTGTGGGCTG CTGCAGTGAA AAGCAGGAAG AAGATGGCCT TGACTTTTTA 2700 

AATAATTTGG AACCCAAATT TATTACATTA GCAGAAGCAT GCACAAAGAG ATAATGTCAC 2760 

AGTGCTACAA TTAGGTCTTT GTCAGACATT CTGGAGGTTT CCAAAAATAA TATTGTAAAG 2820 

TTCAATTTCA ACATGTATGT ATATGATGAT TTTTTTCTCA ATTTTGAATT ATGCTACTCA 2880 

CCAATTTATA TTTTTAAAGC CAGTTGTTGC TTATCTTTTC CAAAAAGTGA AAAATGTTAA 2940 

AACAGACAAC TGGTAAATCT CAAACTCCAG CACTGGAATT AAGGTCTCTA AAGCATCTGC 3000 

TCTTTTTTTT TTTTACGGAT ATTTTAGTAA TAAATATGCT GGATAAATAT TAGTCCAACA 3060 

ATAGCTAAGT TATGCTAATA TCACATTATT ATGTATTCAC TTTAAGTGAT AGTTTAAAAA 3120 

ATAAACAAGA AATATTGAGT ATCACTATGT GAAGAAAGTT TTGGAAAAGA AACAATGAAG 3180 

ACTGAATTAA ATTAAAAATG TTGCAGCTCA TAAAGAATTG GGACTCACCC CTACTGCACT 3240 

ACCAAATTCA TTTGACTTTG GAGGCAAAAT GTGTTGAAGT GCCCTATGAA GTAGCAATTT 3300 

TCTATAGGAA TATAGTTGGA AATAAATGTG TGTGTGTATA TTATTATTAA TCAATGCAAT 3360 

ATTTAAAATG AAATGAGAAC AAAGAGGAAA ATGGTAAAAA CTTGAAATGA GGCTGGGGTA 3420 

TAGTTTGTCC TACAATAGAA AAAAGAGAGA GCTTCCTAGG CCTGGGCTCT TAAATGCTGC 3480 

ATTATAACTG AGTCTATGAG GAAATAGTTC CTGTCCAATT TGTGTAATTT GTTTAAAATT 3540 

GTAAATAAAT TAAACTTTTC TGGTTTCTGT GGGAAGGAAA TAGGGAATCC AATGGAACAG 3600 

TAGCTTTGCT TTGCAGTCTG TTTCAAGATT TCTGCATCCA CAAGTTAGTA GCAAACTGGG 3660 

GAATACTCGC TGCAGCTGGG GTTCCCTGCT TTTTGGTAGC AAGGGTCCAG AGATGAGGTG 3720 

TTTTTTTCGG GGAGCTAATA ACAAAAACAT TTTAAAACTT ACCTTTACTG AAGTTAAATC 3780 

CTCTATTGCT GTTTCTATTC TCTCTTATAG TGACCAACAT CTTTTTAATT TAGATCCAAA 3840 

TAACCATGTC CTCCTAGAGT TTAGAGGCTA GAGGGAGCTG AGGGGAGGAT CTTACTGAAA 3900 

GCACCCTGGG GAGATTGATT GTCCTTAAAC CTAAGCCCCA CAAACTTGAC ACCTGATCAG 3960 

GTCTGGGAGC TACAAAATTT CATTTTTCTC CTCACTGCCC TTCTTCTGAG TGGCATTGGC 4020 

CTGAATCAAG GAAAGCCAGG CCTTGTGGGC CCCCTTCTTT CGGCTTTCTG CTAAAGCAAC 4080 

ACCTCCAGCA GAGATTCCCT TAAGTGACTC CAGGTTTTCC ACCATCCTTC AGCGTGAATT 4140 

AATTTTTAAT CAGTTTGCTT TCTCCAGAGA AATTTTAAAA TAATAGAAGA AATAGAAATT 4200 

TTGAATGTAT AAAAGAAAAA GATCAAGTTG TCATTTTAGA ACAGAGGGAA CTTTGGGAGA 4260 

AAGCAGCCCA AGTAGGTTAT TTGTACAGTC AGAGGGCAAC AGGAAGATGC AGGCCTTCAA 4320 

GGGCAAGGAG AGGCCACAAG GAATATGGGT GGGAGTAAAA GCAACATCGT CTGCTTCATA 4380 

CTTTTTCCTA GGCTTGGCAC TGCCTTTTCC TTTCTCAGGC CAATGGCAAC TGCCATTTGA 4440 

GTCCGGTGAG GGATCAGCCA ACCTCTTCTC TATGGCTCAC CTTATTTGGA GTGAGAAATC 4500 

AAGGAGACAG AGCTGACTGC ATGATGAGTC TGAAGGCATT TGCAGGATGA GCCTGAACTG 4560 

GTTGTGCAGA ACAAACAAGG CATTCATGGG AATTGTTGTA TTCCTTCTGC AGCCCTCCTT 4620 

CTGGGCACTA AGAAGGTCTA TGAATTAAAT GCCTATCTAA AATTCTGATT TATTCCTACA 4680 

TTTTCTGTTT TCTAATTTGA CCCTAAAATC TATGTGTTTT AGACTTAGAC TTTTTATTGC 4740 

CCCCCCCCCC TTTTTTTTTG AGACGGAGTC TCGCTCTGAC GCACAGGCTG GAGTGCAGTG 4800 

GCTCCGATCT CTGCTCACTG AAAGCTCCGC CTCCCGGGTT CATGCCATTC TCCTGCCTCA 4860 

GCCTCCTGAG TAGCTGGGAC TACAGGCGCC CACCACCACG CCCGGCTAAT TTTTTGTATT 4920 

TTTAATAGAG ACGGGGTTTC ACTGTGTTAG CCAGGATGGT CTCGATCTCC TGACCTCGTG 4980 

ATCCGCCTGC CTCGGCCTCC CAAAGTGCTG GGATTACAGG CATGACCCAC CGCTCCCGGC 5040 

CTTGTTTTCC GTTTAAAGTC GTCTTCTTTT AATGTAATCA TTTTGAACAT GTGTGAAAGT 5100 

TGATCATACG AATTGGATCA ATCTTGAAAT ACTCAACCAA AAGACAGTCG AGAAGCCAGG 5160 

GGGAGAAAGA ACTCAGGGCA CAAAATATTG GTCTGAGAAT GGAATTCTCT GTAAGCCTAG 5220 

TTGCTGAAAT TTCCTGCTGT AACCAGAAGC CAGTTTTATC TAACGGCTAC TGAAACACCC 5280 

ACTGTGTTTT GCTCACTCCC TCACTCACCG ATCAAAACCT GCTACCTCCC CAAGACTTTA 5340 

CTAGTGCCGA TAAACTTTCT CAAAGAGCAA CCAGTATCAC TTCCCTGTTT ATAAAACCTC 5400 

TAACCATCTC TTTGTTCTTT GAACATGCTG AAAACCACCT GGTCTGCATG TATGCCCGAA 5460 

TTTGTAATTC TTTTCTCTCA AATGAAAATT TAATTTTAGG GATTCATTTC TATATTTTCA 5520 

CATATGTAGT ATTATTATTT CCTTATATGT GTAAGGTGAA ATTTATGGTA TTTGAGTGTG 5580 

CAAGAAAATA TATTTTTAAA GCTTTCATTT TTCCCCCAGT GAATGATTTA GAATTTTTTA 5640 

TGTAAATATA CAGAATGTTT TTTCTTACTT TTATAAGGAA GCAGCTGTCT AAAATGCAGT 5700 

GGGGTTTGTT TTGCAATGTT TTAAACAGAG TTTTAGTATT GCTATTAAAA GAAGTTACTT 5760 

TGCTTTTAAA GAAACTTGGC TGCTTAAAAT AAGCAAAAAT TGGATGCATA AAGTAATATT 5820 

TACAGATGTG GGGAGATGTA ATAAAACAAT ATTAACTTGG TTTCTTGTTT TTGCTGTATT 5880 

TAGAGATTAA ATAATTCTAA GATGATCACT TTGCAAAATT ATGCTTATGG CTGGCATGGA 5940 

AATAGAAATA CTCAATTATG TCTTTGTTGT ATTAATGGGG AATATTTTGG ACAATGTTTC 6000 

ATTATCAAAT TGTCGACATC ATTAATATAT ATTGTAATGT TGGGAAGAGA TCACTATTTT 6060 

GAAGCACAGC TTTACAGATG AGTATCTATG ATACATATGT ATAATAAATT TTGATCGGGT 6120 

ATTAAAAGTA TTAGAAGGTG GTTATAATTG CAGAGTATTC CATGAATAGT ACACTGACAC 6180 

AGGGGTTTTA CTTTGAGGAC CAGTGTAGTC AAGGGAAAAC ATGAGTTAAA AAGAAAAGCA 6240 

GGCAATATTG CAGTCTTGAT TCTGCCACTT ACAGGATAGA TAATGCCTGA ACTTTAATGA 6300 

CAAGATGATC CAACCATAAA GGTGCTCTGT GCTTCACAGT GAATCTTTTC CCCATGCAGG 6360 

AGTGTGCTCC CCTACAAACG TTAAGACTGA TCATTTCAAA AATCTATTAG CTATATCAAA 6420 

AGCCTTACAT TTTAATATAG GTTGAACCAA AATTTCAATT CCAGTAACTT CTATTGTAAC 6480 

CATTATTTTT GTGTATGTCT TCAAGAATGT TCATTGGATT TTTGTTTGTA ATAGTAAAAT 6540 

ACCGGATACA TTTCACGTGT CCTTCAGTAT TGATTTGGTT GAATATTGGG TCATAATGGT 6600 

TGAGAAGCAT GGACACTAGA GCCAGAATGC TTGGATATGA ATCCTGGATC TGTCACTTAC 6660 

TTCTGTGTGA CCTTTGAAAG GCTACTTATT TCCTCTCTTA GCTTTCTCAT TAAAATCAAT 6720 

GAACAATGCC AGCCTCATGG GGTTGTTGAA TGATTAAATT AGTTAATATA CCTAAAGTAC 6780 
ATAGAACACT GCCTGCACAT AGTAAAAGAA TTATAAGTGT GAGGTAGTTG GTAAAATTAT 6840 

GTAGTTGGAT ATACTACCGA ACAATATCTA ATCTCTTTTT AGGGAAATAA AGTTTGTGCA 6900 
TATATATAAT CCCGAAACAT G 

Seq ID NO: 32 Protein sequence: 
Protein Accession #: NP_001932.1 

1 11 21 31 41 51 

I I I I I I 

MAAAGPRRSV RGAVCLHLLL TLVIFSRDGE ACKKVILNVP SKLEADKIIG RVNLEECFRS 60 

ADLIRSSDPD FRVLNDGSVY TARAVALSDK KRSFTIWLSD KRKQTQKEVT VLLEHQKKVS 120 

KTRHTRETVL RRAKRRWAPI PCSMQENSLG PFPLFLQQVE SDAAQNYTVF YSISGRGVDK 180 

EPLNLFYIER DTGNLFCTRP VDREEYDVFD LIAYASTADG YSADLPLPLP IRVEDENDNH 240 

PVFTEAIYNF EVLESSRPGT TVGVVCATDR DEPDTMHTRL KYSILQQTPR SPGLFSVHPS 300 

TGVITTVSHY LDREVVDKYS LIMKVQDMDG QFFGLIGTST CIITVTDSND NAPTFRQNAY 360 

EAFVEENAFN VEILRIPIED KDLINTANWR VNFTILKGNE NGHFKISTDK ETNEGVLSVV 420 

KPLNYEENRQ VNLEIGVNNE APFARDIPRV TALNRALVTV HVRDLDEGPE CTPAAQYVRI 480 

KENLAVGSKI NGYKAYDPEN RNGNGLRYKK LHDPKGWITI DEISGSIITS KILDREVETP 540 

KNELYNITVL AIDKDDRSCT GTLAVNIEDV NDNPPEILQE YVVICKPKMG YTDILAVDPD 600 

EPVHGAPFYF SLPNTSPEIS RLWSLTKVND TAARLSYQKN AGFQEYTIPI TVKDRAGQAA 660 

TKLLRVNLCE CTHPTQCRAT SRSTGVILGK WAILAILLGI ALLFSVLLTL VCGVFGATKG 720 

KRFPEDLAQQ NLIISNTEAP GDDRVCSANG FMTQTTNNSS QGFCGTMGSG MKNGGQETIE 780 

MMKGGNQTLE SCRGAGHHHT LDSCRGGHTE VDNCRYTYSE WHSFTQPRLG EKLHRCNQNE 840 
DRMPSQDYVL TYNYEGRGSP AGSVGCCSEK QEEDGLDFLN NLEPKFITLA EACTKR 

Seq ID NO: 33 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 64-2583 

1 11 21 31 41 51 

I I I I I I 

GGCAGGTCTC GCTCTCGGCA CCCTCCCGGC GCCCGCGTTC TCCTGGCCCT GCCCGGCATC 60 

CCGATGGCCG CCGCTGGGCC CCGGCGCTCC GTGCGCGGAG CCGTCTGCCT GCATCTGCTG 120 

CTGACCCTCG TGATCTTCAG TCGTGATGGT GAAGCCTGCA AAAAGGTGAT ACTTAATGTA 180 

CCTTCTAAAC TAGAGGCAGA CAAAATAATT GGCAGAGTTA ATTTGGAAGA GTGCTTCAGG 240 

TCTGCAGACC TCATCCGGTC AAGTGATCCT GATTTCAGAG TTCTAAATGA TGGGTCAGTG 300 

TACACAGCCA GGGCTGTTGC GCTGTCTGAT AAGAAAAGAT CATTTACCAT ATGGCTTTCT 360 

GACAAAAGGA AACAGACACA GAAAGAGGTT ACTGTGCTGC TAGAACATCA GAAGAAGGTA 420 

TCGAAGACAA GACACACTAG AGAAACTGTT CTCAGGCGTG CCAAGAGGAG ATGGGCACCT 480 

ATTCCTTGCT CTATGCAAGA GAATTCCTTG GGCCCTTTCC CATTGTTTCT TCAACAAGTT 540 

GAATCTGATG CAGCACAGAA CTATACTGTC TTCTACTCAA TAAGTGGACG TGGAGTTGAT 600 

AAAGAACCTT TAAATTTGTT TTATATAGAA AGAGACACTG GAAATCTATT TTGCACTCGG 660 

CCTGTGGATC GTGAAGAATA TGATGTTTTT GATTTGATTG CTTATGCGTC AACTGCAGAT 720 

GGATATTCAG CAGATCTGCC CCTCCCACTA CCCATCAGGG TAGAGGATGA AAATGACAAC 780 

CACCCTGTTT TCACAGAAGC AATTTATAAT TTTGAAGTTT TGGAAAGTAG TAGACCTGGT 840 

ACTACAGTGG GGGTGGTTTG TGCCACAGAC AGAGATGAAC CGGACACAAT GCATACGCGC 900 

CTGAAATACA GCATTTTGCA GCAGACACCA AGGTCACCTG GGCTCTTTTC TGTGCATCCC 960 

AGCACAGGCG TAATCACCAC AGTCTCTCAT TATTTGGACA GAGAGGTTGT AGACAAGTAC 1020 

TCATTGATAA TGAAAGTACA AGACATGGAT GGCCAGTTTT TTGGATTGAT AGGCACATCA 1080 

ACTTGTATCA TAACAGTAAC AGATTCAAAT GATAATGCAC CCACTTTCAG ACAAAATGCT 1140 

TATGAAGCAT TTGTAGAGGA AAATGCATTC AATGTGGAAA TCTTACGAAT ACCTATAGAA 1200 

GATAAGGATT TAATTAACAC TGCCAATTGG AGAGTCAATT TTACCATTTT AAAGGGAAAT 1260 

GAAAATGGAC ATTTCAAAAT CAGCACAGAC AAAGAAACTA ATGAAGGTGT TCTTTCTGTT 1320 

GTAAAGCCAC TGAATTATGA AGAAAACCGT CAAGTGAAGC TGGAAATTGG AGTAAACAAT 1380 

GAAGCGCCAT TTGCTAGAGA TATTCCCAGA GTGACAGCCT TGAACAGAGC CTTGGTTACA 1440 

GTTCATGTGA GGGATCTGGA TGAGGGGCCT GAATGCACTC CTGCAGCCCA ATATGTGCGG 1500 

ATTAAAGAAA ACTTAGCAGT GGGGTCAAAG ATCAACGGCT ATAAGGCATA TGACCCCGAA 1560 

AATAGAAATG GCAATGGTTT AAGGTACAAA AAATTGCATG ATCCTAAAGG TTGGATCACC 1620 

ATTGATGAAA TTTCAGGGTC AATCATAACT TCCAAAATCC TGGATAGGGA GGTTGAAACT 1680 

CCCAAAAATG AGTTGTATAA TATTACAGTC CTGGCAATAG ACAAAGATGA TAGATCATGT 1740 

ACTGGAACAC TTGCTGTGAA CATTGAAGAT GTAAATGATA ATCCACCAGA AATACTTCAA 1800 

GAATATGTAG TCATTTGCAA ACCAAAAATG GGGTATACCG ACATTTTAGC TGTTGATCCT 1860 

GATGAACCTG TCCATGGAGC TCCATTTTAT TTCAGTTTGC CCAATACTTC TCCAGAAATC 1920 

AGTAGACTGT GGAGCCTCAC CAAAGTTAAT GATACAGCTG CCCGTCTTTC ATATCAGAAA 1980 

AATGCTGGAT TTCAAGAATA TACCATTCCT ATTACTGTAA AAGACAGGGC CGGCCAAGCT 2040 

GCAACAAAAT TATTGAGAGT TAATCTGTGT GAATGTACTC ATCCAACTCA GTGTCGTGCG 2100 

ACTTCAAGGA GTACAGGAGT AATACTTGGA AAATGGGCAA TCCTTGCAAT ATTACTGGGT 2160 

ATAGCACTGC TCTTTTCTGT ATTGCTAACT TTAGTATGTG GAGTTTTTGG TGCAACTAAA 2220 

GGGAAACGTT TTCCTGAAGA TTTAGCACAG CAAAACTTAA TTATATCAAA CACAGAAGCA 2280 

CCTGGAGACG ATAGAGTGTG CTCTGCCAAT GGATTTATGA CCCAAACTAC CAACAACTCT 2340 

AGCCAAGGTT TTTGTGGTAC TATGGGATCA GGAATGAAAA ATGGAGGGCA GGAAACCATT 2400 

GAAATGATGA AAGGAGGAAA CCAGACCTTG GAATCCTGCC GGGGGGCTGG GCATCATCAT 2460 

ACCCTGGACT CCTGCAGGGG AGGACACACG GAGGTGGACA ACTGCAGATA CACTTACTCG 2520 

GAGTGGCACA GTTTTACTCA ACCCCGTCTC GGTGAAGAAT CCATTAGAGG ACACACTGGT 2580 

TAAAAATTAA ACATAAAAGA AATTGCATCG ATGTAATCAG AATGAAGACC GCATGCCATC 2640 

CCAAGATTAT GTCCTCACTT ATAACTATGA GGGAAGAGGA TCTCCAGCTG GTTCTGTGGG 2700 

CTGCTGCAGT GAAAAGCAGG AAGAAGATGG CCTTGACTTT TTAAATAATT TGGAACCCAA 2760 

ATTTATTACA TTAGCAGAAG CATGCACAAA GAGATAATGT CACAGTGCTA CAATTAGGTC 2820 

TTTGTCAGAC ATTCTGGAGG TTTCCAAAAA TAATATTGTA AAGTTCAATT TCAACATGTA 2880 

TGTATATGAT GATTTTTTTC TCAATTTTGA ATTATGCTAC TCACCAATTT ATATTTTTAA 2940 

AGCCAGTTGT TGCTTATCTT TTCCAAAAAG TGAAAAATGT TAAAACAGAC AACTGGTAAA 3000 

TCTCAAACTC CAGCACTGGA ATTAAGGTCT CTAAAGCATC TGCTCTTTTT TTTTTTTACG 3060 

GATATTTTAG TAATAAATAT GCTGGATAAA TATTAGTCCA ACAATAGCTA AGTTATGCTA 3120 

ATATCACATT ATTATGTATT CACTTTAAGT GATAGTTTAA AAAATAAACA AGAAATATTG 3180 

AGTATCACTA TGTGAAGAAA GTTTTGGAAA AGAAACAATG AAGACTGAAT TAAATTAAAA 3240 

ATGTTGCAGC TCATAAAGAA TTGGGACTCA CCCCTACTGC ACTACCAAAT TCATTTGACT 3300

TTGGAGGCAA AATGTGTTGA AGTGCCCTAT GAAGTAGCAA TTTTCTATAG GAATATAGTT 3360

GGAAATAAAT GTGTGTGTGT ATATTATTAT TAATCAATGC AATATTTAAA ATGAAATGAG 3420 

AACAAAGAGG AAAATGGTAA AAACTTGAAA TGAGGCTGGG GTATAGTTTQ TCCTACAATA 3480 

GAAAAAAGAG AGAGCTTCCT AGGCCTGGGC TCTTAAATGC TGCATTATAA CTGAGTCTAT 3540 

GAGGAAATAG TTCCTGTCCA ATTTGTGTAA TTTGTTTAAA ATTGTAAATA AATTAAACTT 3600

TTCTGGTTTC TGTGGGAAGG AAATAGGGAA TCCAATGGAA CAGTAGCTTT GCTTTGCAGT 3660 

CTGTTTCAAG ATTTCTGCAT CCACAAGTTA GTAGCAAACT GGGGAATACT GGCTGCAGCT 3720 

GGGGTTCCCT GCTTTTTGGT AGCAAGGGTC CAGAGATGAG GTGTTTTTTT CGGGGAGCTA 3780 

ATAACAAAAA CATTTTAAAA CTTACCTTTA CTGAAGTTAA ATCCTCTATT GCTGTTTCTA 3840 

TTCTCTCTTA TAGTGACCAA CATCTTTTTA ATTTAGATCC AAATAACCAT GTCCTCCTAG 3900 

AGTTTAGAGG CTAGAGGGAG CTGAGGGGAG GATCTTACTG AAAGCACCCT GGGGAGATTG 3960 

ATTGTCCTTA AACCTAAGCC CCACAAACTT GACACCTGAT CAGGTCTGGG AGCTACAAAA 4020 

TTTCATTTTT CTCCTCACTG CCCTTCTTCT GAGTGGCATT GGCCTGAATC AAGGAAAGCC 4080 

AGGCCTTGTG GGCCCCCTTC TTTCGGCTTT CTGCTAAAGC AACACCTCCA GCAGAGATTC 4140 

CCTTAAGTGA CTCCAGGTTT TCCACCATCC TTCAGCGTGA ATTAATTTTT AATCAGTTTG 4200 

CTTTCTCCAG AGAAATTTTA AAATAATAGA AGAAATAGAA ATTTTGAATG TATAAAAGAA 4260 

AAAGATCAAG TTGTCATTTT AGAACAGAGG GAACTTTGGG AGAAAGCAGC CCAAGTAGGT 4320 

TATTTGTACA GTCAGAGGGC AACAGGAAGA TGCAGGCCTT CAAGGGCAAG GAGAGGCCAC 4380 

AAGGAATATG GGTGGGAGTA AAAGCAACAT CGTCTGCTTC ATACTTTTTC CTAGGCTTGG 4440 

CACTGCCTTT TCCTTTCTCA GGCCAATGGC AACTGCCATT TGAGTCCGGT GAGGGATCAG 4500 

CCAACCTCTT CTCTATGGCT CACCTTATTT GGAGTGAGAA ATCAAGGAGA CAGAGCTGAC 4560 

TGCATGATGA GTCTGAAGGC ATTTGCAGGA TGAGCCTGAA CTGGTTGTGC AGAACAAACA 4620 

AGGCATTCAT GGGAATTGTT GTATTCCTTC TGCAGCCCTC CTTCTGGGCA CTAAGAAGGT 4680 

CTATGAATTA AATGCCTATC TAAAATTCTG ATTTATTCCT ACATTTTCTG TTTTCTAATT 4740

TGACCCTAAA ATCTATGTGT TTTAGACTTA GACTTTTTAT TGCCCCCCCC CCCTTTTTTT 4800 

TTGAGACGGA GTCTCGCTCT GACGCACAGG CTGGAGTGCA GTGGCTCCGA TCTCTGCTCA 4860 

CTGAAAGCTC CGCCTCCCGG GTTCATGCCA TTCTCCTGCC TCAGCCTCCT GAGTAGCTGG 4920 

GACTACAGGC GCCCACCACC ACGCCCGGCT AATTTTTTGT ATTTTTAATA GAGACGGGGT 4980 

TTCACTGTGT TAGCCAGGAT GGTCTCGATC TCCTGACCTC GTGATCCGCC TGCCTCGGCC 5040 

TCCCAAAGTG CTGGGATTAC AGGCATGACC CACCGCTCCC GGCCTTGTTT TCCGTTTAAA 5100 

GTCGTCTTCT TTTAATGTAA TCATTTTGAA CATGTGTGAA AGTTGATCAT ACGAATTGGA 5160 

TCAATCTTGA AATACTCAAC CAAAAGACAG TCGAGAAGCC AGGGGGAGAA AGAACTCAGG 5220 

GCACAAAATA TTGGTCTGAG AATGGAATTC TCTGTAAGCC TAGTTGCTGA AATTTCCTGC 5280 

TGTAACCAGA AGCCAGTTTT ATCTAACGGC TACTGAAACA CCCACTGTGT TTTGCTCACT 5340 

CCCACTCACC GATCAAAACC TGCTACCTCC CCAAGACTTT ACTAGTGCCG ATAAACTTTC 5400 

TCAAAGAGCA ACCAGTATCA CTTCCCTGTT TATAAAACCT CTAACCATCT CTTTGTTCTT 5460 

TGAACATGCT GAAAACCACC TGGTCTGCAT GTATGCCCGA ATTTGTAATT CTTTTCTCTC 5520

AAATGAAAAT TTAATTTTAG GGATTCATTT CTATATTTTC ACATATGTAG TATTATTATT 5580 

TCCTTATATG TGTAAGGTGA AATTTATGGT ATTTGAGTGT GCAAGAAAAT ATATTTTTAA 5640 

AGCTTTCATT TTTCCCCCAG TGAATGATTT AGAATTTTTT ATGTAAATAT ACAGAATGTT 5700 

TTTTCTTACT TTTATAAGGA AGCAGCTGTC TAAAATGCAG TGGGGTTTGT TTTGCAATGT 5760 

TTTAAACAGA GTTTTAGTAT TGCTATTAAA AGAAGTTACT TTGCTTTTAA AGAAACTTGG 5820 

CTGCTTAAAA TAAGCAAAAA TTGGATGCAT AAAGTAATAT TTACAGATGT GGGGAGATGT 5880 

AATAAAACAA TATTAACTTG GCTGCTTAAA ATAAGCAAAA ATTGGATGCA TAAAGTAATA 5940 

TTTACAGATG TGGGGAGATG TAATAAAACA ATATTAACTT GGTTTCTTGT TTTTGCTGTA 6000 

TTTAGAGATT AAATAATTCT AAGATGATCA CTTTGCAAAA TTATGCTTAT GGCTGGCATG 6060 

GAAATAGAAA TACTCAATTA TGTCTTTGTT GTATTAATGG GGAATATTTT GGACAATGTT 6120 

TCATTATCAA ATTGTCGACA TCATTAATAT ATATTGTAAT GTTGGGAAGA GATCACTATT 6180 

TTGAAGCACA GCTTTACAGA TGAGTATCTA TGATACATAT GTATAATAAA TTTTGATCGG 6240 

GTATTAAAAG TATTAGAAGG TGGTTATAAT TGCAGAGTAT TCCATGAATA GTACACTGAC 6300 

ACAGGGGTTT TACTTTGAGG ACCAGTGTAG TCAAGGGAAA ACATGAGTTA AAAAGAAAAG 6360 

CAGGCAATAT TGCAGTCTTG ATTCTGCCAC TTACAGGATA GATAATGCCT GAACTTTAAT 6420 

GACAAGATGA TCCAACCATA AAGGTGCTCT GTGCTTCACA GTGAATCTTT TCCCCATGCA 6480 

GGAGTGTGCT CCCCTACAAA CGTTAAGACT GATCATTTCA AAAATCTATT AGCTATATCA 6540 

AAAGCCTTAC ATTTTAATAT AGGTTGAACC AAAATTTCAA TTCCAGTAAC TTCTATTGTA 6600 

ACCATTATTT TTGTGTATGT CTTCAAGAAT GTTCATTGGA TTTTTGTTTG TAATAGTAAA 6660 

ATACCGGATA CATTTCACGT GTCCTTCAGT ATTGATTTGG TTGAATATTG GGTCATAATG 6720 

GTTGAGAAGC ATGGACACTA GAGCCAGAAT GCTTGGATAT GAATCCTGGA TCTGTCACTT 6780 

ACTTCTGTGT GACCTTTGAA AGGCTACTTA TTTCCTCTCT TAGCTTTCTC ATTAAAATCA 6840 

ATGAACAATG CCAGCCTCAT GGGGTTGTTG AATGATTAAA TTAGTTAATA TACCTAAAGT 6900 

ACATAGAACA CTGCCTGCAC ATAGTAAAAG AATTATAAGT GTGAGGTAGT TGGTAAAATT 6960 

ATGTAGTTGG ATATACTACC GAACAATATC TAATCTCTTT TTAGGGAAAT AAAGTTTGTG 7020 
CATATATATA ATCCCGAAAC ATG 



Seq ID NO: 34 Protein sequence: 
Protein Accession #: NP_077741.1 

1 11 21 31 41 51
MAAAGPRRSV RGAVCLHLLL TLVIPSRDGE ACKKVILNVP SKLEADKIIG RVNLEECFRS 60

ADLIRSSDPD FRVLNDGSVY TARAVALSDK KRSFTIWLSD KRKQTQKEVT VLLEHQKKVS 120 

KTRHTRETVL RRAKRRWAPI PCSMQENSLG PFPLFLQQVE SDAAQNYTVF YSISGRGVDK 180 

EPLNLFYIER DTGNLFCTRP VDREEYDVFD LIAYASTADG YSADLPLPLP IRVEDENDNH 240 

PVFTEAIYNF EVLESSRPGT TVGVVCATDR DEPDTMHTRL KYSILQQTPR SPGLFSVHPS 300

TGVITTVSHY LDREVVDKYS LIMKVQDMDG QFFGLIGTST CIITVTDSND NAPTFRQNAY 360

EAFVEENAFN VEILRIPIED KDLINTANWR VNFTILKGNE NGHFKISTDK ETNEGVLSVV 420

KPLNYEENRQ VNLEIGVNNE APFARDIPRV TALNRALVTV HVRDLDEGPE CTPAAQYVRI 480 

KENLAVGSKI NGYKAYDPEN RNGNGLRYKK LHDPKGWITI DEISGSIITS KILDREVETP 540 

KNELYNITVL AIDKDDRSCT GTLAVNIEDV NDNPPEILQE YVVICKPKMG YTDILAVDPD 600

EPVHGAPFYF SLPNTSPEIS RLWSLTKVND TAARLSYQKN AGFQEYTIPI TVKDRAGQAA 660

TKLLRVNLCE CTHPTQCRAT SRSTGVILGK WAILAILLGI ALLFSVLLTL VCGVFGATKG 720 

KRFPEDLAQQ NLIISNTEAP GDDRVCSANG FMTQTTNNSS QGFCGTMGSG MKKGGQETIE 780 
MMKGGNQTLE SCRGAGHHHT LDSCRGGHTS VDNCRYTYSE WHSFTQPRLG EESIRGHTG

Seq ID NO: 35 DNA sequence

GGGAGTGGGC GTGGCGGTGC TGCCCAGGTG AGCCACCGCT GCTTCTGCCC AGACACGGTC 60

GCCTCCACAT CCAGGTCTTT GTGCTCCTCG CTTGCCTGTT CCTTTTCCAC GCATTTTCCA 120

GGATAACTGT GACTCCAGGC CCGCAATGGA TGCCCTGCAA CTAGCAAATT CGGCTTTTGC 180 

CGTTGATCTG TTCAAACAAC TATGTGAAAA GGAGCCACTG GGCAATGTCC TCTTCTCTCC 240 

AATCTGTCTC TCCACCTCTC TGTCACTTGC TCAAGTGGGT GCTAAAGGTG ACACTGCAAA 300 

TGAAATTGGA CAGGTTCTTC ATTTTGAAAA TGTCAAAGAT ATACCCTTTG GATTTCAAAC 360 

AGTAACATCG GATGTAAACA AACTTAGTTC CTTTTACTCA CTGAAACTAA TCAAGCGGCT 420

CTACGTAGAC AAATCTCTGA ATCTTTCTAC AGAGTTCATC AGCTCTACGA AGAGACCCTA 480

TGCAAAGGAA TTGGAAACTG TTGACTTCAA AGATAAATTG GAAGAAACGA AAGGTCAGAT 540 

CAACAACTCA ATTAAGGATC TCACAGATGG CCACTTTGAG AACATTTTAG CTGACAACAG 600 

TGTGAACGAC CAGACCAAAA TCCTTGTGGT TAATGCTGCC TACTTTGTTG GCAAGTGGAT 660 

GAAGAAATTT CCTGAATCAG AAACAAAAGA ATGTCCTTTC AGACTCAACA AGACAGACAC 720

CAAACCAGTG CAGATGATGA ACATGGAGGC CACGTTCTGT ATGGGAAACA TTGACAGTAT 780 

CAATTGTAAG ATCATAGAGC TTCCTTTTCA AAATAAGCAT CTCAGCATGT TCATCCTACT 840 

ACCCAAGGAT GTGGAGGATG AGTCCACAGG CTTGGAGAAG ATTGAAAAAC AACTCAACTC 900 

AGAGTCACTG TCACAGTGGA CTAATCCCAG CACCATGGCC AATGCCAAGG TCAAACTCTC 960 

CATTCCAAAA TTTAAGGTGG AAAAGATGAT TGATCCCAAG GCTTGTCTGG AAAATCTAGG 1020

GCTGAAACAT ATCTTCAGTG AAGACACATC TGATTTCTCT GGAATGTCAG AGACCAAGGG 1080 

AGTGGCCCTA TCAAATGTTA TCCACAAAGT GTGCTTAGAA ATAACTGAAG ATGGTGGGGA 1140 

TTCCATAGAG GTGCCAGGAG CACGGATCCT GCAGCACAAG GATGAATTGA ATGCTGACCA 1200 

TCCCTTTATT TACATCATCA GGCACAACAA AACTCGAAAC ATCATTTTCT TTGGCAAATT 1260 

CTGTTCTCCT TAAGTGGCAT AGCCCATGTT AAGTCCTCCC TGACTTTTCT GTGGATGCCG 1320

ATTTCTGTAA ACTCTGCATC CAGAGATTCA TTTTCTAGAT ACAATAAATT GCTAATGTTG 1380 

CTGGATCAGG AAGCCGCCAG TACTTGTCAT ATGTAGCCTT CACACAGATA GACCTTTTTT 1440 

TTTTTCCAAT TCTATCTTTT GTTTCCTTTT TTCCCATAAG ACAATGACAT ACGCTTTTAA 1500 

TGAAAAGGAA TCACGTTAGA GGAAAAATAT TTATTCATTA TTTGTCAAAT TGTCCGGGGT 1560 

AGTTGGCAGA AATACAGTCT TCCACAAAGA AAATTCCTAT AAGGAAGATT TGGAAGCTCT 1620

TCTTCCCAGC ACTATGCTTT CCTTCTTTGG GATAGAGAAT GTTCCAGACA TTCTCGCTTC 1680 

CCTGAAAGAC TGAAGAAAGT GTAGTGCATG GGACCCACGA AACTGCCCTG GCTCCAGTGA 1740 

AACTTGGGCA CATGCTCAGG CTACTATAGG TCCAGAAGTC CTTATGTTAA GCCCTGGCAG 1800 

GCAGGTGTTT ATTAAAATTC TGAATTTTGG GGATTTTCAA AAGATAATAT TTTACATACA 1860 

CTGTATGTTA TAGAACTTCA TGGATCAGAT CTGGGGCAGC AACCTATAAA TCAACACCTT 1920

AATATGCTGC AACAAAATGT AGAATATTCA GACAAAATGG ATACATAAAG ACTAAGTAGC 1980 

CCATAAGGGG TCAAAATTTG CTGCCAAATG CGTATGCCAC CAACTTACAA AAACACTTCG 2040 

TTCGCAGAGC TTTTCAGATT GTGGAATGTT GGATAAGGAA TTATAGACCT CTAGTAGCTG 2100 

AAATGCAAGA CCCCAAGAGG AAGTTCAGAT CTTAATATAA ATTCACTTTC ATTTTTGATA 2160 

GCTGTCCCAT CTGGTCATGT GGTTGGCACT AGACTGGTGG CAGGGGCTTC TAGCTGACTC 2220

GCACAGGGAT TCTCACAATA GCCGATATCA GAATTTGTGT TGAAGGAACT TGTCTCTTCA 2280 

TCTAATATGA TAGCGGGAAA AGGAGAGGAA ACTACTGCCT TTAGAAAATA TAAGTAAAGT 2340 

GATTAAAGTG CTCACGTTAC CTTGACACAT AGTTTTTCAG TCTATGGGTT TAGTTACTTT 2400 

AGATGGCAAG CATGTAACTT ATATTAATAG TAATTTGTAA AGTTGGGTGG ATAAGCTATC 2460 

CCTGTTGCCG GTTCATGGAT TACTTCTCTA TAAAAAATAT ATATTTACCA AAAAATTTTG 2520

TGACATTCCT TCTCCCATCT CTTCCTTGAC ATGCATTGTA AATAGGTTCT TCTTGTTCTG 2580 
AGATTCAATA TTGAATTTCT CCTATGCTAT TGACAATAAA ATATTATTGA ACTACC 

Seq ID NO: 36 Protein sequence:
Protein Accession #: NP_002630.1

MDALQLANSA FAVDLFKQLC EKEPLGNVLF SPICLSTSLS LAQVGAKGDT ANEIGQVLHF 60

ENVKDIPFGF QTVTSDVNKL SSFYSLKLIK RLYVDKSLNL STEFISSTKR PYAKELETVD 120

FKDKLEETKG QINNSIKDLT DGHFENILAD NSVNDQTKIL VVNAAYFVGK WMKKFPESET 180

KECPFRLNKT DTKPVQMMNM EATFCMGNID SINCKIIELP FQNKHLSMFI LLPKDVEDES 240 

TGLEKIEKQL NSESLSQWTN PSTMANAKVK LSIPKFKVEK MIDPKACLEN LGLKHIFSED 300 

TSDFSGMSET KGVALSNVIH KVCLEITEDG GDSIEVPGAR ILQHKDELNA DHPFIYIIRH 360 
NKTRNIIFFG KFCSP



Seq ID NO: 37 DNA sequence

GGAGTGGGGG AGAGAGAGGA GACCAGGACA GCTGCTGAGA CCTCTAAGAA GTCCAGATAC 60

TAAGAGCAAA GATGTTTCAA ACTGGGGGCC TCATTGTCTT CTACGGGCTG TTAGCCCAGA 120

CCATGGCCCA GTTTGGAGGC CTGCCCGTGC CCCTGGACCA GACCCTGCCC TTGAATGTGA 180 

ATCCAGCCCT GCCCTTGAGT CCCACAGGTC TTGCAGGAAG CTTGACAAAT GCCCTCAGCA 240 

ATGGCCTGCT GTCTGGGGGC CTGTTGGGCA TTCTGGAAAA CCTTCCGCTC CTGGACATCC 300 

TGAAGCCTGG AGGAGGTACT TCTGGTGGCC TCCTTGGGGG ACTGCTTGGA AAAGTGACGT 360 

CAGTGATTCC TGGCCTGAAC AACATCATTG ACATAAAGGT CACTGACCCC CAGCTGCTGG 420

AACTTGGCCT TGTGCAGAGC CCTGATGGCC ACCGTCTCTA TGTCACCATC CCTCTCGGCA 480 

TAAAGCTCCA AGTGAATACG CCCCTGGTCG GTGCAAGTCT GTTGAGGCTG GCTGTGAAGC 540 

TGGACATCAC TGCAGAAATC TTAGCTGTGA GAGATAAGCA GGAGAGGATC CACCTGGTCC 600 

TTGGTGACTG CACCCATTCC CCTGGAAGCC TGCAAATTTC TCTGCTTGAT GGACTTGGCC 660 

CCCTCCCCAT TCAAGGTCTT CTGGACAGCC TCACAGGGAT CTTGAATAAA GTCCTGCCTG 720

AGTTGGTTCA GGGCAACGTG TGCCCTCTGG TCAATGAGGT TCTCAGAGGC TTGGACATCA 780

CCCTGGTGCA TGACATTGTT AACATGCTGA TCCACGGACT ACAGTTTGTC ATCAAGGTCT 840 

AAGCCTTCCA GGAAGGGGCT GGCCTCTGCT GAGCTGCTTC CCAGTGCTCA CAGATGGCTG 900 

GCCCATGTGC TGGAAGATGA CACAGTTGCC TTCTCTCCGA GGAACCTGCC CCCTCTCCTT 960 

TCCCACCAGG CGTGTGTAAC ATCCCATGTG CCTCACCTAA TAAAATGGCT CTTCTTCTGC 1020
AAAAAAAAAA AAAAAAAAAA AAAAAAAAA

Seq ID NO: 38 Protein sequence:

MFQTGGLIVF YGLLAQTMAQ FGGLPVPLDQ TLPLNVNPAL PLSPTGLAGS LTNALSNGLL 60

SGGLLGILEN LPLLDILKPG GGTSGGLLGG LLGKVTSVIP GLNNIIDIKV TDPQLLELGL 120 

VQSPDGHRLY VTIPLGIKLQ VNTPLVGASL LRLAVKLDIT AEILAVRDKQ ERIHLVLGDC 180 

THSPGSLQIS LLDGLGPLPI QGLLDSLTGI LNKVLPELVQ GNVCPLVNEV LRGLDITLVH 240 
DIVNMLIHGL QFVIKV 



Seq ID NO: 39 DNA sequence 

Nucleic Acid Accession #: NM_004363.1

Coding sequence: 115-2223

CTCAGGGCAG AGGGAGGAAG GACAGCAGAC CAGACAGTCA CAGCAGCCTT GACAAAACGT 60

TCCTGGAACT CAAGCTCTTC TCCACAGAGG AGGACAGAGC AGACAGCAGA GACCATGGAG 120 

TCTCCCTCGG CCCCTCCCCA CAGATGGTGC ATCCCCTGGC AGAGGCTCCT GCTCACAGCC 180 

TCACTTCTAA CCTTCTGGAA CCCGCCCACC ACTGCCAAGC TCACTATTGA ATCCACGCCG 240 

TTCAATGTCG CAGAGGGGAA GGAGGTGCTT CTACTTGTCC ACAATCTGCC CCAGCATCTT 300 

TTTGGCTACA GCTGGTACAA AGGTGAAAGA GTGGATGGCA ACCGTCAAAT TATAGGATAT 360 

GTAATAGGAA CTCAACAAGC TACCCCAGGG CCCGCATACA GTGGTCGAGA GATAATATAC 420 

CCCAATGCAT CCCTGCTGAT CCAGAACATC ATCCAGAATG ACACAGGATT CTACACCCTA 480 

CACGTCATAA AGTCAGATCT TGTGAATGAA GAAGCAACTG GCCAGTTCCG GGTATACCCG 540

GAGCTGCCCA AGCCCTCCAT CTCCAGCAAC AACTCCAAAC CCGTGGAGGA CAAGGATGCT 600 

GTGGCCTTCA CCTGTGAACC TGAGACTCAG GACGCAACCT ACCTGTGGTG GGTAAACAAT 660 

CAGAGCCTCC CGGTCAGTCC CAGGCTGCAG CTGTCCAATG GCAACAGGAC CCTCACTCTA 720 

TTCAATGTCA CAAGAAATGA CACAGCAAGC TACAAATGTG AAACCCAGAA CCCAGTGAGT 780 

GCCAGGCGCA GTGATTCAGT CATCCTGAAT GTCCTCTATG GCCCGGATGC CCCCACCATT 840 

TCCCCTCTAA ACACATCTTA CAGATCAGGG GAAAATCTGA ACCTCTCCTG CCACGCAGCC 900 

TCTAACCCAC CTGCACAGTA CTCTTGGTTT GTCAATGGGA CTTTCCAGCA ATCCACCCAA 960 

GAGCTCTTTA TCCCCAACAT CACTGTGAAT AATAGTGGAT CCTATACGTG CCAAGCCCAT 1020 

AACTCAGACA CTGGCCTCAA TAGGACCACA GTCACGACGA TCACAGTCTA TGCAGAGCCA 1080

CCCAAACCCT TCATCACCAG CAACAACTCC AACCCCGTGG AGGATGAGGA TGCTGTAGCC 1140 

TTAACCTGTG AACCTGAGAT TCAGAACACA ACCTACCTGT GGTGGGTAAA TAATCAGAGC 1200 

CTCCCGGTCA GTCCCAGGCT GCAGCTGTCC AATGACAACA GGACCCTCAC TCTACTCAGT 1260 

GTCACAAGGA ATGATGTAGG ACCCTATGAG TGTGGAATCC AGAACGAATT AAGTGTTGAC 1320 

CACAGCGACC CAGTCATCCT GAATGTCCTC TATGGCCCAG ACGACCCCAC CATTTCCCCC 1380 

TCATACACCT ATTACCGTCC AGGGGTGAAC CTCAGCCTCT CCTGCCATGC AGCCTCTAAC 1440 

CCACCTGCAC AGTATTCTTG GCTGATTGAT GGGAACATCC AGCAACACAC ACAAGAGCTC 1500 

TTTATCTCCA ACATCACTGA GAAGAACAGC GGACTCTATA CCTGCCAGGC CAATAACTCA 1560 

GCCAGTGGCC ACAGCAGGAC TACAGTCAAG ACAATCACAG TCTCTGCGGA GCTGCCCAAG 1620 

CCCTCCATCT CCAGCAACAA CTCCAAACCC GTGGAGGACA AGGATGCTGT GGCCTTCACC 1680 

TGTGAACCTG AGGCTCAGAA CACAACCTAC CTGTGGTGGG TAAATGGTCA GAGCCTCCCA 1740 

GTCAGTCCCA GGCTGCAGCT GTCCAATGGC AACAGGACCC TCACTCTATT CAATGTCACA 1800 

AGAAATGACG CAAGAGCCTA TGTATGTGGA ATCCAGAACT CAGTGAGTGC AAACCGCAGT 1860 

GACCCAGTCA CCCTGGATGT CCTCTATGGG CCGGACACCC CCATCATTTC CCCCCCAGAC 1920 

TCGTCTTACC TTTCGGGAGC GAACCTCAAC CTCTCCTGCC ACTCGGCCTC TAACCCATCC 1980 

CCGCAGTATT CTTGGCGTAT CAATGGGATA CCGCAGCAAC ACACACAAGT TCTCTTTATC 2040 

GCCAAAATCA CGCCAAATAA TAACGGGACC TATGCCTGTT TTGTCTCTAA CTTGGCTACT 2100 

GGCCGCAATA ATTCCATAGT CAAGAGCATC ACAGTCTCTG CATCTGGAAC TTCTCCTGGT 2160 

CTCTCAGCTG GGGCCACTGT CGGCATCATG ATTGGAGTGC TGGTTGGGGT TGCTCTGATA 2220 

TAGCAGCCCT GGTGTAGTTT CTTCATTTCA GGAAGACTGA CAGTTGTTTT GCTTCTTCCT 2280 

TAAAGCATTT GCAACAGCTA CAGTCTAAAA TTGCTTCTTT ACCAAGGATA TTTACAGAAA 2340 

AGACTCTGAC CAGAGATCGA GACCATCCTA GCCAACATCG TGAAACCCCA TCTCTACTAA 2400 

AAATACAAAA ATGAGCTGGG CTTGGTGGCG CGCACCTGTA GTCCCAGTTA CTCGGGAGGC 2460 

TGAGGCAGGA GAATCGCTTG AACCCGGGAG GTGGAGATTG CAGTGAGCCC AGATCGCACC 2520 

ACTGCACTCC AGTCTGGCAA CAGAGCAAGA CTCCATCTCA AAAAGAAAAG AAAAGAAGAC 2580 

TCTGACCTGT ACTCTTGAAT ACAAGTTTCT GATACCACTG CACTGTCTGA GAATTTCCAA 2640 

AACTTTAATG AACTAACTGA CAGCTTCATG AAACTGTCCA CCAAGATCAA GCAGAGAAAA 2700 

TAATTAATTT CATGGGACTA AATGAACTAA TGAGGATTGC TGATTCTTTA AATGTCTTGT 2760 

TTCCCAGATT TCAGGAAACT TTTTTTCTTT TAAGCTATCC ACTCTTACAG CAATTTGATA 2820

AAATATACTT TTGTGAACAA AAATTGAGAC ATTTACATTT TCTCCCTATG TGGTCGCTCC 2880 

AGACTTGGGA AACTATTCAT GAATATTTAT ATTGTATGGT AATATAGTTA TTGCACAAGT 2940 
TCAATAAAAA TCTGCTCTTT GTATAACAGA AAAA 

Seq ID NO: 40 Protein sequence: 
Protein Accession #: NP_004354.1

1 11 21 31 41 51 

I I I I I I 

MESPSAPPHR WCIPWQRLLL TASLLTFWNP PTTAKLTIES TPFNVAEGKE VLLLVHNLPQ 60

HLFGYSWYKG ERVDGNRQII GYVIGTQQAT PGPAYSGREI IYPNASLLIQ NIIQNDTGFY 120

TLHVIKSDLV NEEATGQFRV YPELPKPSIS SNNSKPVEDK DAVAFTCEPE TQDATYLWWV 180

NNQSLPVSPR LQLSNGNRTL TLFNVTRNDT ASYKCETQNP VSARRSDSVI LNVLYGPDAP 240

TISPLNTSYR SGENLNLSCH AASNPPAQYS WFVNGTFQQS TQELFIPNIT VNNSGSYTCQ 300

AHNSDTGLNR TTVTTITVYA EPPKPFITSN NSNPVEDEDA VALTCEPEIQ NTTYLWWVNN 360

QSLPVSPRLQ LSNDNRTLTL LSVTRNDVGP YECGIQNELS VDHSDPVILN VLYGPDDPTI 420 

SPSYTYYRPG VNLSLSCHAA SNPPAQYSWL IDGNIQQHTQ ELFISNITEK NSGLYTCQAN 480 

NSASGHSRTT VKTITVSAEL PKPSISSNNS KPVEDKDAVA FTCEPEAQNT TYLWWVNGQS 540 

LPVSPRLQLS NGNRTLTLFN VTRNDARAYV CGIQNSVSAN RSDPVTLDVL YGPDTPIISP 600

PDSSYLSGAN LNLSCHSASN PSPQYSWRIN GIPQQHTQVL FIAKITPNNN GTYACFVSNL 660 
ATGRNNSIVK SITVSASGTS PGLSAGATVG IMIGVLVGVA LI

Seq ID NO: 41 DNA sequence

Nucleic Acid Accession #: NM_006952.1

Coding sequence: 11-793 

1 11 21 31 41 51

AATCCCGACA ATGGCGAAAG ACAACTCAAC TGTTCGTTGC TTCCAGGGCC TGCTGATTTT 60

TGGAAATGTG ATTATTGGTT GTTGCGGCAT TGCCCTGACT GCGGAGTGCA TCTTCTTTGT 120

ATCTGACCAA CACAGCCTCT ACCCACTGCT TGAAGCCACC GACAACGATG ACATCTATGG 180 

GGCTGCCTGG ATCGGCATAT TTGTGGGCAT CTGCCTCTTC TGCCTGTCTG TTCTAGGCAT 240 

TGTAGGCATC ATGAAGTCCA GCAGGAAAAT TCTTCTGGCG TATTTCATTC TGATGTTTAT 300 

AGTATATGCC TTTGAAGTGG CATCTTGTAT CACAGCAGCA ACACAACGAG ACTTTTTCAC 360 

ACCCAACCTC TTCCTGAAGC AGATGCTAGA GAGGTACCAA AACAACAGCC CTCCAAACAA 420 

TGATGACCAG TGGAAAAACA ATGGAGTCAC CAAAACCTGG GACAGGCTCA TGCTCCAGGA 480 

CAATTGCTGT GGCGTAAATG GTCCATCAGA CTGGCAAAAA TACACATCTG CCTTCCGGAC 540 

TGAGAATAAT GATGCTGACT ATCCCTGGCC TCGTCAATGC TGTGTTATGA ACAATCTTAA 600 

AGAACCTCTC AACCTGGAGG CTTGTAAACT AGGCGTGCCT GGTTTTTATC ACAATCAGGG 660 

CTGCTATGAA CTGATCTCTG GTCCAATGAA CCGACACGCC TGGGGGGTTG CCTGGTTTGG 720 

ATTTGCCATT CTCTGCTGGA CTTTTTGGGT TCTCCTGGGT ACCATGTTCT ACTGGAGCAG 780
AATTGAATAT TAAGAA 

Seq ID NO: 42 Protein sequence: 
Protein Accession #: NP_008883.1 

1 11 21 31 41 51 

| | | | | |

MAKDNSTVRC FQGLLIFGNV IIGCCGIALT AECIFFVSDQ HSLYPLLEAT DNDDIYGAAW 60
IGIFVGICLF CLSVLGIVGI MKSSRKILLA YFILMFIVYA FEVASCITAA TQRDFFTPNL 120
FLKQMLERYQ NNSPPNNDDQ WKNNGVTKTW DRLMLQDNCC GVNGPSDWQK YTSAFRTENN 180
DADYPWPRQC CVMNNLKEPL NLEACKLGVP GFYHNQGCYE LISGPMNRHA WGVAWFGFAI 240
LCWTFWVLLG TMFYWSRIEY
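
The "Coding sequence" coordinates in these listings appear to be 1-based and inclusive: for Seq ID NO: 41 (coding sequence 11-793) the ATG start codon begins at base 11, and translating from there reproduces the start of Seq ID NO: 42. The sketch below illustrates that relationship on the first listing row only. It assumes the standard genetic code; the helper names `clean` and `translate` are hypothetical and are not part of the original disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order for each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def clean(listing: str) -> str:
    """Keep only A/C/G/T, dropping block spaces, row-end numbers and newlines.

    Caution: any other character (for example an OCR artifact) is silently
    dropped, which would shift the reading frame downstream of it.
    """
    return "".join(ch for ch in listing.upper() if ch in "ACGT")


def translate(dna: str, start: int, end: int) -> str:
    """Translate dna[start..end] (1-based, inclusive) up to the first stop codon."""
    cds = dna[start - 1:end]
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First row of Seq ID NO: 41 exactly as laid out in the listing.
row = "AATCCCGACA ATGGCGAAAG ACAACTCAAC TGTTCGTTGC TTCCAGGGCC TGCTGATTTT 60"
dna = clean(row)
print(translate(dna, 11, len(dna)))
```

Passing the full cleaned sequence with `start=11, end=793` would be the natural next step for checking a complete protein listing against its DNA listing.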



Seq ID NO: 43 DNA sequence 
Nucleic Acid Accession #: Eos sequence
Coding sequence: 83-2605 

1 11 21 31 41 51 

I I I I I I 

GCCGGACAGA TCTGCGCGTA TCCTGGAGCC GGCCCAGTTG TGAACTAGGA GAGCTTTGGG 60 

ACCTCTGTCC CAAGCAAGAG AGATGAATGG AGAGTATAGA GGCAGAGGAT TTGGACGAGG 120 

AAGATTTCAA AGCTGGAAAA GGGGAAGAGG TGGTGGGAAC TTCTCAGGAA AATGGAGAGA 180 

AAGAGAACAC AGACCTGATC TGAGTAAAAC CACAGGAAAA CGTACTTCTG AACAAACCCC 240 

ACAGTTTTTG CTTTCAACAA AGACCCCACA GTCAATGCAG TCAACATTGG ATCGATTCAT 300

ACCATATAAA GGCTGGAAGC TTTATTTCTC TGAAGTTTAC AGCGATAGCT CTCCTTTGAT 360 

TGAGAAGATT CAAGCATTTG AAAAATTTTT CACAAGGCAT ATTGATTTGT ATGACAAGGA 420 

TGAAATAGAA AGAAAGGGAA GTATTTTGGT AGATTTTAAA GAACTGACAG AAGGTGGTGA 480 

AGTAACTAAC TTGATACCAG ATATAGCAAC TGAACTAAGA GATGCACCTG AGAAAACCTT 540 

GGCTTGCATG GGTTTGGCAA TACATCAGGT GTTAACTAAG GACCTTGAAA GGCATGCAGC 600 

TGAGTTACAA GCCCAGGAAG GATTGTCTAA TGATGGAGAA ACAATGGTAA ATGTGCCACA 660 

TATTCATGCA AGGGTGTACA ACTATGAGCC TTTGACACAG CTCAAGAATG TCAGAGCAAA 720 

TTACTATGGA AAATACATTG CTCTAAGAGG GACAGTGGTT CGTGTCAGTA ATATAAAGCC 780

TCTTTGCACC AAGATGGCTT TTCTTTGTGC TGCATGTGGA GAAATTCAGA GCTTTCCTCT 840 

TCCAGATGGA AAATACAGTC TTCCCACAAA GTGTCCTGTG CCTGTGTGTC GAGGCAGGTC 900 

ATTTACTGCT CTCCGCAGCT CTCCTCTCAC AGTTACGATG GACTGGCAGT CAATCAAAAT 960 

CCAGGAATTG ATGTCTGATG ATCAGAGAGA AGCAGGTCGG ATTCCACGAA CAATAGAATG 1020 

TGAGCTTGTT CATGATCTTG TGGATAGCTG TGTCCCGGGA GACACAGTGA CTATTACTGG 1080 

AATTGTCAAA GTCTCAAATG CGGAAGAAGG TTCTCGAAAT AAGAATGACA AGTGTATGTT 1140 

CCTTTTGTAT ATTGAAGCAA ATTCTATTAG TAATAGCAAA GGACAGAAAA CAAAGAGTTC 1200 

TGAGGATGGG TGTAAGCATG GAATGTTGAT GGAGTTCTCA CTTAAAGACC TTTATGCCAT 1260 

CCAAGAGATT CAAGCTGAAG AAAACCTGTT TAAACTCATT GTCAACTCGC TTTGCCCTGT 1320 

CATTTTTGGT CATGAACTTG TTAAAGCAGG TTTGGCATTA GCACTCTTTG GAGGAAGCCA 1380 

GAAATACGCA GATGACAAAA ACAGAATTCC AATTCGGGGA GACCCCCACA TCCTTGTTGT 1440 

TGGAGATCCA GGCCTAGGAA AAAGTCAAAT GCTACAGGCA GCGTGCAATG TTGCCCCACG 1500 

TGGCGTGTAT GTTTGTGGTA ACACCACGAC CACCTCTGGT CTGACGGTAA CTCTTTCAAA 1560 

AGATAGTTCC TCTGGAGATT TTGCTTTGGA AGCTGGTGCC CTGGTACTTG GTGATCAAGG 1620 

TATTTGTGGA ATCGATGAAT TTGATAAGAT GGGGAATCAA CATCAAGCCT TGTTGGAAGC 1680 

CATGGAGCAG CAAAGTATTA GTCTTGCTAA GGCTGGTGTG GTTTGTAGCC TTCCTGCAAG 1740 

AACTTCCATT ATTGCTGCTG CAAATCCAGT TGGAGGACAT TACAATAAAG CCAAAACAGT 1800 

TTCTGAGAAT TTAAAAATGG GGAGTGCACT ACTATCCAGA TTTGATTTGG TCTTTATCCT 1860 

GTTAGATACT CCAAATGAGC ATCATGATCA CTTACTCTCT GAACATGTGA TTGCAATAAG 1920 

AGCTGGAAAG CAGAGAACCA TTAGCAGTGC CACAGTAGCT CGTATGAATA GTCAAGATTC 1980 

AAATACTTCC GTACTTGAAG TAGTTTCTGA GAAGCCATTA TCAGAAAGAC TAAAGGTGGT 2040 

TCCTGGAGAA ACAATAGATC CCATTCCCCA CCAGCTATTG AGAAAGTACA TTGGCTATGC 2100 

TCGGCAGTAT GTGTACCCAA GGCTATCCAC AGAAGCTGCT CGAGTTCTTC AAGATTTTTA 2160 

CCTTGAGCTC CGGAAACAGA GCCAGAGGTT AAATAGCTCA CCAATCACTA CCAGGCAGCT 2220 

GGAATCTTTG ATTCGTCTGA CAGAGGCACG AGCAAGGTTG GAATTGAGAG AGGAAGCAAC 2280 

CAAAGAAGAC GCTGAGGATA TAGTGGAAAT TATGAAATAT AGCATGCTAG GAACTTACTC 2340 

TGATGAATTT GGGAACCTAG ATTTTGAGCG ATCCCAGCAT GGTTCTGGAA TGAGCAACAG 2400 

GTCAACAGCG AAAAGATTTA TTTCTGCTCT CAACAACGTT GCTGAAAGAA CTTATAATAA 2460 

TATATTTCAA TTTCATCAAC TTCGGCAGAT TGCCAAAGAA CTAAACATTC AGGTTGCTGA 2520 

TTTTGAAAAT TTTATTGGAT CACTAAATGA CCAGGGTTAC CTCTTGAAAA AAGGCCCAAA 2580 

AGTTTACCAG CTTCAAACTA TGTAAAAGGA CTTCACCAAG TTAGGGCCTC CTGGGTTTAT 2640 

TGCAGATTAA AGCCATCTCA GTGAAGATAT GCGTGCACGC ACAGACAGAC AGACACACAC 2700 

ACACACACAC ACACACACAC ACACACACAC ACACACAGTC AAATACTGTT CTCTGAAAAA 2760 

TGATGTCCCA AAAGTATTAT AATAGGAAAA AAGCATTAAA TATAATAAAC TAATTTAAGA 2820

AGTGATAAAG TCTCCAGATG CAGTAGCTCA CACTGTAATC ACAGTGACTC AGGAGGCTGA 2880

GGTGAGAGGA TTCCTTGAGG CCAGGGTTCG AGACCAACCT TGGGCAACAT AGCAAGACCC 2940

CATTTCTTAA AAAAAAAAAA AAAAAATTTA AACTTAGCTG GGTATGGTGG CACATGCCTA 3000

TAGTCTCAGC TACTTGTGAG GCTGAGGCAG GAGGATTCTT TGAGCCCAGG AGTTTGAGGT 3060

TACAGTGAGC CACAATCACA CCAATCACTG CACTCCAGCC TGGGCAATAA AGTAACTCTT 3120

GACTCAAAAA AATAAAAAAA ATTGTAGTGG TAGCCATGTG TTAATTGTTA AATAAATTCT 3180

CCAAAGGGCT AAAAGTAAAT TACTTATAAA TTTTTTATAG TTGTATTTTT GACCTGCCTT 3240

TTATATGTAT GAATATTTCA TAGTTTTGCA TATCAGATGT AGGCATACAG ACAAATACAT 3300

AAACCAATGA ATATATTACA TATTCTGTGT TCCAATAAAA CTTTATTTAT GGACACTAAA 3360

ATTTGAATTT CATAAAATTT TCCCATGTCA AGAATACAAA ATACTTGAGT TTTGTTTTTA 3420

GCTATTTAAT AATAGGTCTC ATTTATTCCA CAGGCTGTAG TTTGTAGTCT TGCTTGAAAC 3480

AATAGAAACA GACTGATTAA GCAGGAGAAG TTTTTTGAAA GAATTTTGTT TGGCTCACGG 3540

AATTATTAGA AGGCAGGTGA ACCAGGAGGG TAAGCTTCCA GCAGCAATTT GTAAAACCAT 3600

GCCTTAGAAT TGGACTAAGG AAGAAGCTGC TGACACTCCA CTGCCACACA GGGCACTGGA 3660

AGAAAGTGCT GCTGCCTCCC TGCCCCACCT TTGCCACTTC TGCAGCAGGA ATAGGTAGAA 3720

GAATGCCCCC ACCCGCACCG GAACAGCAAC AAAAGGATTC TGCATGAGAT GCCTCCCTAA 3780

ATTGCTGAAT TCAAAAAAGA AGTTGCATAC AAAGACATCT GATTGAAAAA GGGTATGTTA 3840

TATGCCCCTT TCATAGGCTG CTAGGGAGTT TTCCTGGTTC TACTTTCAGG TGGTGGGATC 3900

AATAAGACCA GAATTTCTCA TATGTTGTGA GAGGATTCAA ATGTTACAGG GTTGCCAGCC 3960

AAACTATCAA TCATGTATAA ATCCAACAAA CACTTTGTAA CATACAAGAA CTCAGGAAAT 4020

GTGAACCATT GTTGGAGAAT CTACTAAAAT ACGGCTTCCC GCAAACGAAG ATGAATGGAA 4080

AATGTAAATA AAAAGAACTG GCAGTGTATA TCAGATGTTT AACTATAGGA CCAGAACTAA 4140

GATGTGGAGA CTATTGCCAT AGACCACAAT GTAAATTTTT AAGTGAGGAA GGAAAAATCA 4200

GGAATCAAAA GGGGCCAGGT GCAGTGGCTC ACATCTATAA TCCCAGAGCT TTGGGAGTTC 4260

GAGGCAGGAG GATCACTTGA AGCCAGTTTT GAGACCAGCC TATGCAACAC ATTGAGACCC 4320

TATCTCTACA AAAAATAGAT TAGCTGGGCA CGGTGGTGCA TGCCTATTGT CCTACCTACT 4380

GTGGAGGCTG AAGTAGGAAA TCACTTGAGC CCGAGAGTTT GAGGTTACAG TGAGCTATGA 4440

TTATACCACT GCACTCCAGC CTGGGCAAGA GAGCAAGACC TTGTCTCTT

Seq ID NO: 44 Protein sequence:
Protein Accession #: CAB55276.2

1 11 21 31 41 51
| | | | | |

MNGEYRGRGF GRGRFQSWKR GRGGGNFSGK WREREHRPDL SKTTGKRTSE QTPQFLLSTK 60
TPQSMQSTLD RFIPYKGWKL YFSEVYSDSS PLIEKIQAFE KFFTRHIDLY DKDEIERKGS 120
ILVDFKELTE GGEVTNLIPD IATELRDAPE KTLACMGLAI RQVLTKDLER HAAELQAQEG 180
LSNDGETKVN VPHIHARVYN YEPLTQLKNV RANYYGKYIA LRGTVVRVSN IKPLCTKMAF 240
LCAACGEIQS FPLPDGKYSL PTKCPVPVCR GRSFTALRSS PLTVTMDWQS IKIQELMSDD 300
QREAGRIPRT IECELVHDLV DSCVPGDTVT ITGIVKVSNA EEGSRNKNDK CMFLLYIEAN 360
SISNSKGQKT KSSEDGCKHG MLMEFSLKDL YAIQEIQAEE NLFKLIVNSL CPVIFGHELV 420
KAGLALALFG GSQKYADDKN RIPIRGDPHI LVVGDPGLGK SQMLQAACNV APRGVYVCGN 480
TTTTSGLTVT LSKDSSSGDF ALEAGALVLG DQGICGIDEF DKMGNQHQAL LEAMEQQSIS 540
LAKAGVVCSL PARTSIIAAA NPVGGHYNKA KTVSENLKMG SALLSRFDLV FILLDTPNEH 600
HDHLLSEHVI AIRAGKQRTI SSATVARMNS QDSNTSVLEV VSEKPLSERL KVVPGETIDP 660
IPHQLLRKYI GYARQYVYPR LSTEAARVLQ DFYLELRKQS QRLNSSPITT RQLESLIRLT 720
EARARLELRE EATKEDAEDI VEIMKYSMLG TYSDEFGNLD FERSQHGSGM SNRSTAKRFI 780
SALNNVAERT YNNIFQFHQL RQIAKELNIQ VADFENFIGS LNDQGYLLKK GPKVYQLQTM

Seq ID NO: 45 DNA sequence

Nucleic Acid Accession #: NM_005416.1

Coding sequence: 149..658

1 11 21 31 41 51
| | | | | |

ACCAGATCCC AGAGGCTGAA CACCTCGACC TTCTCTGCAC AGCAGATGAT CCCTGAGCAG 60
CTGAAGACCA GAAAAGCCAC TAAGACTTTC TGCTTAATTC AGGAGCTTAG AGGATTCTTC 120
AAAGAGTGTG TCCACGATCC TTTGAAGCAT GAGTTCTTAC CAGCAGAAGC AGACCTTTAC 180
CCCACCACCT CAGCTTCAAC AGCAGCAGGT GAAACAACCC AGCCAGCCTC CACCTCAGGA 240
AATATTTGTT CCCACAACCA AGGAGCCATG CCACTCAAAG GTTCCACAAC CTGGAAACAC 300
AAAGATTCCA GAGCCAGGCT GTACCAAGGT CCCTGAGCCA GGCTGTACCA AGGTCCCTGA 360
GCCAGGCTGT ACCAAGGTCC CTGAGCCAGG TTGTACCAAG GTCCCTGAGC CAGGCTGTAC 420
CAAGGTCCCT GAGCCAGGTT GTACCAAGGT CCCTGAGCCA GGCTACACCA AGGTCCCTGA 480
ACCAGGCAGC ATCAAGGTCC CTGACCAAGG CTTCATCAAG TTTCCTGAGC CAGGTGCCAT 540
CAAAGTTCCT GAGCAAGGAT ACACCAAAGT TCCTGTGCCA GGCTACACAA AGCTACCAGA 600
GCCATGTCCT TCAACGGTCA CTCCAGGCCC AGCTCAGCAG AAGACCAAGC AGAAGTAATT 660
TGGTGCACAG ACAAGCCCTT GAGAAGCCAA CCACCAGATG CTGGACACCC TCTTCCCATC 720
TGTTTCTGTG TCTTAATTGT CTGTAGACCT TGTAATCAGC ACATTGTCAC CCCAAGCCAT 780
AGTCTCTCTC TTATTTGTAT CCTAAAAATA CGTACTATAA AGCTTTTGTT CACACACACT 840
CTGAAGAATC CTGTAAGCCC CTGAATTAAG CAGAAAGTCT TCATGGCTTT TCTGGTCTTC 900
GGCTGCTCAG GGTTCATCTG AAGATTCGAA TGAAAAGAAA TGCATGTTTC CTGCTCTTCC 960
CTCATTAAAT TGCTTTTAAT TCCA

Seq ID NO: 46 Protein sequence:
Protein Accession #: NP_005407.1

1 11 21 31 41 51
| | | | | |

MSSYQQKQTF TPPPQLQQQQ VKQPSQPPPQ EIFVPTTKEP CHSKVPQPGN TKIPEPGCTK 60
VPEPGCTKVP EPGCTKVPEP GCTKVPEPGC TKVPEPGCTK VPEPGYTKVP EPGSIKVPDQ 120
GFIKFPEPGA IKVPEQGYTK VPVPGYTKLP EPCPSTVTPG PAQQKTKQK

Seq ID NO: 47 DNA sequence

Nucleic Acid Accession #: Eos sequence



1 11 21 31 41 51
| | | | | |

GCGTCGTGTG CAGGCGTCCC CGGGCTGTGG ATAATTAGAC ACGTTCTTCC CTCATTGCCC 60
AAGGCTCGTT AGAATTCGCC CTAGAGCTGT ATCATGTATT TTCTTTCAAA TTAACTTTGC 120
TTGCAATTAA GCTTAGGGAA CCAGCAACAA AAGCAAACTT GGCCCGAGGT CGTTCACCGC 180
GAAAATGGAT TAGAGAAACT TCTTCCCCGA TTTAAGGGGA AAGATTCCTG CGGCCAGCGC 240
TTTGGGGAAA GTGCCCCGAC CGCAGAGGCG ACGACAGGGG AGCAGGAAGC TGCTCACGGT 300
AGTCGGCGTT GGCGGCAGGG GTGGCCTTCC TCATCTGGGC GATGTGGGCT CCTAGAAGAG 360
TAAGGATAAC ATCCTGGAAA TGACTTCTGT ACGGTTTGAG CCCAACTGCA CACTCATGAC 420
TTGGAGCTGC CCTGTGGAGT TACAGTTTAC CAAACACATT CATGAACATA ATCTCATTTA 480
CTAAAAACTT TGTGAGAATT TTCTTTTACT AAAATTTTTT CTTATTACAA A

Seq ID NO: 48 DNA sequence:

Nucleic Acid Accession #: CAT cluster

1 11 21 31 41 51
| | | | | |

TTCCAAATTT TTTTTTTTGT AATAAGAAAA AATTTTAGTA AAAGAAAATT CTCACAAAGT 60
TTTTAGTAAA TGAGATTATG TTCATGAATG TGTTTGGTAA ACTGTAACTC CACAGGGCAG 120
CTCCAAGTCA TGAGTGTGCA GTTGGGCTCA AACCGTACAG AAGTCATTTC CAGGATGTTA 180
TCCTTACTCT TCTCGGAGCC CACATCGCCC AGATGAGGAA GGCCACCGCT GCCGCCAACG 240
CCGACTACCG TGAGCAGCTT CCTGCTCCCC TGTCGTCGCC TCTGCGGTCG GGGCACTTTC 300
CCCAAAGCGC TGGCCGCAGG AATCTTTCCC CTTAAATCGG GGAAGAAGTT TCTCTAATCC 360
ATTTTCGCGG TGAACGACCT CGGGCCAAGT TTGCTTTTGT TGCTGGTTCC CTAAGCTTAA 420
TTGCAAGCAA AGTTAATTTG AAAGAAAATA CATGATACAG CTCTAGGGCG AATTCTAACG 480
AGCCTTGGGC AATGAGGGAA GAACGTGTCT AGTTATCCAC AGCCCGGGGA CGCCTGCACA 540
CGACGCT



Seq ID NO: 49 DNA sequence 

Nucleic Acid Accession ft: CAT cluster 



TCTTTCTTCT 
CCTGCCGACC 
ACCGTAGACC 
GACACATGCA 
CCCAACCAAA 
TTCATTTAAA 
CTTTCTCTGA 
TCGCTGTAGC 
TAAAGAAACT 
GATTGAACCA 
TGCTGGTATC 
GACGTGGGGA 
CCGGGTCTCT 
GAGACAGCAT 
ACTG 



11 
I 

GCTGCTCGTT 
TCTGTTGTCT 
CGAAACCATT 
GACCAGTTTT 
GTGTTTAAAA 
AAACTCTAAT 
TCTGTGTCTT 
CATGGGAATC 
GACACAGGAG 
GTGCACTCCA 
GTCCTGCAGC 
GAGCTGGTCT 
CCTGGCCCCG 
CATTTATGAG 



21 
I 

TGTCTCTCCT 
CTTCTCTGAT 
GGGTGTCACA 
CCTGGAACNG 
CTTTTTAGGG 
ATTTATATTA 
TTTTCTTTGA 
CGTTTCATTA 
AATCACTTGA 
GCCTTGGCAG 
CCCATCCTCG 
ATATATCCGG 
GGGACCTAGT 
CCTGCAGCAT 



31 
I 

GTGCTCTTCT 
GGCGGGGGGC 
AGCCGGTCGC 
CATGACCATG 
CACCCCCAAA 
AATACAAAGA 
CAGCATCTCC 
TTATGGTAGC 
ACTTGGGAGG 
CGGAGCAAGA 
GTTCCATTGC 
GTGAAGCTCA 
ATTTTTGCCA 
CCACCCTACT 



41 
I 

TCTTTCTTTC 
GGGAGAAGCT 
CGGCTTTTTT 
TTATTACTAT 
ATTTTTTTTT 
TACCCAAACC 
ATTTTTTTTC 
AATATGGAGT 
CAGAGTTTGC 
TTCTGTCACA 
GCTGCCAGGC 
GCTGTGGCAC 
CGAGTGTACA 
GCTGTATCCA 



51 
I 

CCTCGCCGCT 
GACCGGTGAG 
GGGAGAACCC 
GGGCCGCCTC 
TTTTTTTTTT 
CTTTATGCTT 
TGCTGCTTCA 
GCTGTATTCC 
AGTGAGCCGA 
GTTCCTGAAG 
AGGGTGCTGG 
ACCTTGGATG 
CCAAACAAAG 
GTTTCCATTG 



Seq ID NO: 50 DNA sequence 
Nucleic Acid Accession #: L05187 
Coding sequence: 1991.. 2260 



CTGCAGGGAG 
TCAGAAAGGA 
CAGAAGAAGG 
TGAAGGAAAG 
AGAGTCATAA 
GGAAATGGAT 
ATTTCTAGCT 
CCCCTCCCTT 
ACAACCATCT 
CCAGGGTTAA 
CCCTGCACCT 
CAGCCAGCTA 
GATGAGGATG 
AGCTTCTATT 
TCACACCAAA 
ATTGCAACAA 
ATATGTGTAA 
TATTTTAAGT 
CCTCAGTAGA 
AGTTCATAGC 
TGACAAGATA 
ATTTAAGGCA 
AACATAAAAC 
AGTAATTGGC 
AGGAGACCTC 
AGATGGGAAG 
GAGGCTTAGA 
GAGGAAAGTG 
GAAGCCAGCT 
GAGCCAAGAA 
TTCAAAGGGC 



11 
I 

GCAGGTAGAA 
GGAAAAGGCC 
ATTAGCCCCT 
CAGGTTTTCC 
GTAAATTATT 
GGAAGGTCTT 
TCCACCTTCA 
TCCCACCTAT 
CAATGACAAG 
CTCATGAAAC 
GGGTCTGAGG 
GTGCCAAAAA 
GTAGTGTGAG 
TCCTTGAGGC 
CCCAAGGGAC 
ACTGGCAATT 
GCAGGTTAAT 
TAAATTACAG 
TAGTCATTGA 
AGAACTAGAA 
TTTATAGAAA 
GTATGCTAGG 
CTAGCAGGAA 
ATGACGGAGA 
TAGGGTGTCA 
AAAAGCATTT 
TGAATATAAA 
GTCTGATGCC 
TTAGTAGGGC 
GAGAACTCCA 
CTGAAAATTA 



21 
I 

AAGGCTTTTG 
AGGGCAGATG 
GAAAGTCCCT 
CAGATTAGCA 
CTGAATGTGT 
GGACTCTGAG 
CCAAGGCAGA 
TCATGTGTGC 
GACAGCAGGT 
CCTCCATGAA 
ATGAGGGTGG 
ATATCAGGTG 
TCATGTGTGA 
AGGGCTCATT 
CACACAGCCC 
CTAGTGTACT 
CCAGGGTTTC 
TCTGGATTTG 
ACTGGGAGTC 
CTCAGGCCAG 
TTTTAATTTA 
CACTTTGGAC 
GGTAATACAT 
TGGGCAGAGA 
AGTGATGTGA 
GGAAGGGACT 
GCCATCCTAT 
ATTTTCCAAA 
ATTTTTCCAG 
ATAAAATGGA 
TCCAAGCTTA 



31 

I 

GGTTTTCAGG 
TCTGGGTGGA 
GAAGTAGGAG 
ACCAGTCAGG 
GTAGTTTAAT 
ACAAGGGGTC 
CAAGGAGGGC 
AAGAGTGCCC 
GGCAAGGCTC 
GCCTGCTGCT 
CAGTGAAAAT 
GTGTTCATCA 
CAGGTGAGGA 
CATCTTATAA 
ATTCTGCTCC 
TTTTCATTAT 
AATGGGAGAT 
AAAGGACCTT 
CTGGAGAAGA 
AGCACTCTCA 
TTAGATGGAT 
AAATCAATGC 
ATATATAAAT 
AGGGCTGTGC 
GCTATGATGG 
GTGTAAGCAC 
AAGTCACAGG 
AGACCTAATA 
AACAGATATA 
GCAGAAGAAA 
TTTCATTTTT 



41 
I 

TGGGGGGCAG 
GTGAAGGGAA 
AAGGGTAAAG 
GGGAGGAAGG 
GGAATTGGGA 
TATAATCAGT 
CCACCTCAGC 
TGTCCCACAG 
AACAGGACTC 
CACCCCTCCC 
TAGGCCAGTG 
AATAAGCCGA 
ATGAAAACAG 
AAGCCAGCTG 
GTATACCAGG 
TAGAAATTAG 
AGAGAATAGT 
AGAGATGGTT 
TTGTTCAAAT 
GTAACACTGC 
CTCTACTGAG 
CCTAACGTAC 
AAATGAAATG 
ACTTTTGGGA 
AGGGGTATTT 
AGACCAGAAG 
CTTTCTACAT 
TGCGGACCTC 
AGGTGCCTTG 
TTGCCTTTTA 
AAATGTAATG 



51 
I 

TCTAGCCTGA 
AAAGTGATCC 
GTGTGGTTGG 
TGAGAGTGGG 
AAAAGATGGG 
CCATTTCATT 
TCCTCTGCTC 
AACACGGGGA 
AGATGTCCCC 
TCAAGGCAAG 
ACATCATTTT 
GCCAACCGGT 
AGTGCCCGAG 
GCCATTGCCT 
TAAGTCTCTG 
CTAAAGGCAA 
GGAATATCTT 
AGGGCTCCCA 
GCCCATGGGA 
AATTTCCCCC 
CATTTATTCC 
TTACTTAACA 
CAAAGTAGAT 
GACTTGCTCA 
GGACAAGCAG 
CAAAACCATA 
GGTACTAGGA 
ATGTCCCTCA 
GGTAGGAAGG 
GCTCCTCCTC 
GGGGAGCTAA 



60 
120 
180 
240 
300 
360 
420 
480 



60 
120 
180 
240 
300 
360 
420 
480 
540 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 



WO 02/086443 PCT/US02/12476

GGGAGATGAA AGGCTTTCTC TTCTAAAGGG TCCTGAAATA AAATCTGTTT GGCATTGAAT 1920
TTGTATCCAT CTTTCTTTAA TTGAATCACT GTGTCAGCTT TCTGTCTCTA GAAAAAAACA 1980
CATTTGAAGC ATGAATTCTC AGCAGCAGAA GCAGCCTTGC ACCCCACCCC CTCAGCCTCA 2040
GCAGCAGCAG GTGAAACAAC CTTGCCAGCC TCCACCCCAG GAACCATGCA TCCCCAAAAC 2100
CAAGGAGCCC TGCCAACCCA AGGTGCCTGA GCCCTGCCAC CCCAAAGTGC CTGAGCCCTG 2160
CCAGCCCAAG ATTCCAGAGC CCTGCCAGCC CAAGGTGCCT GAGCCCTGCC CTTCAACGGT 2220
CACTCCAGCA CCAGCCCAGC AGAAGACCAA GCAGAAGTAA TGTGGTCCAC AGCCATGCCC 2280
TTGAGGAGCT GGCCACTGGA TACTGAACAC CCTACTCCAT TCTGCTTATG AATCCCATTT 2340
GCCTATTGAC CCTGCAGTrA GCATGCTGTC ACCCTGAATC ATAATCGCTC CTTTGCACCT 2400
CTAAAAAGAT GTCCCTTACC CTCATTCTGG AGGCTCCTGA GCCTCTGCGT AAGGCTGAAC 2460
GTCTCACTGA CTGAGCTAGT CTTCTTGTTG CTCGGGTGCA TTTGAGGATG GATTTGGGGA 2520
AGGTCAAGTG ACCATCCCTA G

Seq ID NO: 51 Protein sequence:
Protein Accession #: AAC26838



1 11 21 31 41 51

| | | | | |

MNSQQQKQPC TPPPQPQQQQ VKQPCQPPPQ EPCIPKTKEP CQPKVPEPCH PKVPEPCQPK 60
IPEPCQPKVP EPCPSTVTPA PAQQKTKQK



Seq ID NO: 52 DNA sequence

Nucleic Acid Accession #: NM_002638.1

Coding sequence: 120-473

1 11 21 31 41 51

| | | | | |

CAATACAGCT AAGGAATTAT CCCTTGTAAA TACCACAGAC CCGCCCTGGA GCCAGGCCAA 60
GCTGGACTGC ATAAAGATTG GTATGGCCTT AGCTCTTAGC CAAACACCTT CCTGACACCA 120
TGAGGGCCAG CAGCTTCTTG ATCGTGGTGG TGTTCCTCAT CGCTGGGACG CTGGTTCTAG 180
AGGCAGCTGT CACGGGAGTT CCTGTTAAAG GTCAAGACAC TGTCAAAGGC CGTGTTCCAT 240
TCAATGGACA AGATCCCGTT AAAGGACAAG TTTCAGTTAA AGGTCAAGAT AAAGTCAAAG 300
CGCAAGAGCC AGTCAAAGGT CCAGTCTCCA CTAAGCCTGG CTCCTGCCCC ATTATCTTGA 360
TCCGGTGCGC CATGTTGAAT CCCCCTAACC GCTGCTTGAA AGATACTGAC TGCCCAGGAA 420
TCAAGAAGTG CTGTGAAGGC TCTTGCGGGA TGGCCTGTTT CGTTCCCCAG TGAAGGGAGC 480
CGGTCCTTGC TGCACCTGTG CCGTCCCCAG AGCTACAGGC CCCATCTGGT CCTAAGTCCC 540
TGCTGCCCTT CCCCTTCCCA CACTGTCCAT TCTTCCTCCC ATTCAGGATG CCCACGGCTG 600
GAGCTGCCTC TCTCATCCAC TTTCCAATAA A


Seq ID NO: 53 Protein sequence:
Protein Accession #: NP_002629.1

1 11 21 31 41 51

| | | | | |

MRASSFLIVV VFLIAGTLVL EAAVTGVPVK GQDTVKGRVP FNGQDPVKGQ VSVKGQDKVK 60
AQEPVKGPVS TKPGSCPIIL IRCAMLNPPN RCLKDTDCPG IKKCCEGSCG MACFVPQ

Seq ID NO: 54 DNA sequence

Nucleic Acid Accession #: NM_019618

Coding sequence: 75-584

1 11 21 31 41 51

| | | | | |

GGCACGAGCC ACGATTCAGT CCCCTGGACT GTAGATAAAG ACCCTTTCTT GCCAGGTGCT 60
GAGACAACCA CACTATGAGA GGCACTCCAG GAGACGCTGA TGGTGGAGGA AGGGCCGTCT 120
ATCAATCAAT GTGTAAACCT ATTACTGGGA CTATTAATGA TTTGAATCAG CAAGTGTGGA 180
CCCTTCAGGG TCAGAACCTT GTGGCAGTTC CACGAAGTGA CAGTGTGACC CCAGTCACTG 240
TTGCTGTTAT CACATGCAAG TATCCAGAGG CTCTTGAGCA AGGCAGAGGG GATCCCATTT 300
ATTTGGGAAT CCAGAATCCA GAAATGTGTT TGTATTGTGA GAAGGTTGGA GAACAGCCCA 360
CATTGCAGCT AAAAGAGCAG AAGATCATGG ATCTGTATGG CCAACCCGAG CCCGTGAAAC 420
CCTTCCTTTT CTACCGTGCC AAGACTGGTA GGACCTCCAC CCTTGAGTCT GTGGCCTTCC 480
CGGACTGGTT CATTGCCTCC TCCAAGAGAG ACCAGCCCAT CATTCTGACT TCAGAACTTG 540
GGAAGTCATA CAACACTGCC TTTGAATTAA ATATAAATGA CTGAACTCAG CCTAGAGGTG 600
GCAGCTTGGT CTTTGTCTTA AAGTTTCTGG TTCCCAATGT GTTTTCGTCT ACATTTTCTT 660
AGTGTCATTT TCACGCTGGT GCTGAGACAG GGGCAAGGCT GCTGTTATCA TCTCATTTTA 720
TAATGAAGAA GAAGCAATTA CTTCATAGCA ACTGAAGAAC AGGATGTGGC CTCAGAAGCA 780
GGAGAGCTGG GTGGTATAAG GCTGTCCTCT CAAGCTGGTG CTGTGTAGGC CACAAGGCAT 840
CTGCATGAGT GACTTTAAGA CTCAAAGACC AAACACTGAG CTTTCTTCTA GGGGTGGGTA 900
TGAAGATGCT TCAGAGCTCA TGCGCGTTAC CCACGATGGC ATGACTAGCA CAGAGCTGAT 960
CTCTGTTTCT GTTTTGCTTT ATTCCCTCTT GGGATGATAT CATCCAGTCT TTATATGTTG 1020
CCAATATACC TCATTGTGTG TAATAGAACC TTCTTAGCAT TAAGACCTTG TAAACAAAAA 1080
TAATTCTTGT GTTAAGTTAA ATCATTTTTG TCCTAATTGT AATGTGTAAT CTTAAAGTTA 1140
AATAAACTTT GTGTATTTAT ATAATAAAAA AAAAAAAAAA AAA



Seq ID NO: 55 Protein sequence:
Protein Accession #: NP_062564

1 11 21 31 41 51

| | | | | |

MRGTPGDADG GGRAVYQSMC KPITGTINDL NQQVWTLQGQ NLVAVPRSDS VTPVTVAVIT 60
CKYPEALEQG RGDPIYLGIQ NPEMCLYCEK VGEQPTLQLK EQKIMDLYGQ PEPVKPFLFY 120
RAKTGRTSTL ESVAFPDWFI ASSKRDQPII LTSELGKSYN TAFELNIND

Seq ID NO: 56 DNA sequence 

Nucleic Acid Accession #: NM_003125

Coding sequence: 65-334 



1 11 21 31 41 51

| | | | | |

AGCAGTTCTA AGGGACCATA CAGAGTATTC CTCTCTTCAC ACCAGGACCA GCCACTGTTG 60 

CAGCATGAGT TCCCAGCAGC AGAAGCAGCC CTGCATCCCA CCCCCTCAGC TTCAGCAGCA 120 

GCAGGTGAAA CAGCCTTGCC AGCCTCCACC TCAGGAACCA TGCATCCCCA AAACCAAGGA 180 
GCCCTGCCAC CCCAAGGTGC CTGAGCCCTG CCACCCCAAA GTGCCTGAGC CCTGCCAGCC 240

CAAGCTTCCA GAGCCATGCC ACCCCAAGGT GCCTGAGCCC TGCCCTTCAA TAGTCACTCC 300 

AGCACCAGCC CAGCAGAAGA CCAAGCAGAA GTAATGTGGT CCACAGCCAT GCCCTTGAGG 360 

AGCOGGCCAC CAGATGCTGA ATCCCCTATC CCATTCTGTG TATGAGTCCC ATTTGCCTTG 420 

CAATTAGCAT TCTGTCTCCC CCAAAAAAGA ATGTGCTATG AAGCTTTCTT TCCTACACAC 480 

TCTGAGTCTC TGAATGAAGC TGAAGGTCTT AGTACCAGAG CTAGTTTTCA GCTGCTCAGA 540 

ATTCATCTGA AGAGAGACTT AAGATGAAAG CAAATGATTC AGCTCCCTTA TACCCCCATT 600 
AAATTCACTT TCAATTCCA 



Seq ID NO: 57 Protein sequence: 
Protein Accession #: NP_003116

1 11 21 31 41 51

| | | | | |

MSSQQQKQPC IPPPQLQQQQ VKQPCQPPPQ EPCIPKTKEP CHPKVPEPCH PKVPEPCQPK 60 
LPEPCHPKVP EPCPSIVTPA PAQQKTKQK 

Seq ID NO: 58 DNA sequence 

Nucleic Acid Accession #: NM_001793.2

Coding sequence: 71-2560

1 11 21 31 41 51

| | | | | |

AAAGGGGCAA GAGCTGAGCG GAACACCGGC CCGCCGTCGC GGCAGCTGCT TCACCCCTCT 60 

CTCTGCAGCC ATGGGGCTCC CTCGTGGACC TCTCGCGTCT CTCCTCCTTC TCCAGGTTTG 120 

CTGGCTGCAG TGCGCGGCCT CCGAGCCGTG CCGGGCGGTC TTCAGGGAGG CTGAAGTGAC 180 

CTTGGAGGCG GGAGGCGCGG AGCAGGAGCC CGGCCAGGCG CTGGGGAAAG TATTCATGGG 240

CTGCCCTGGG CAAGAGCCAG CTCTGTTTAG CACTGATAAT GATGACTTCA CTGTGCGGAA 300 

TGGCGAGACA GTCCAGGAAA GAAGGTCACT GAAGGAAAGG AATCCATTGA AGATCTTCCC 360 

ATCCAAACGT ATCTTACGAA GACACAAGAG AGATTGGGTG GTTGCTCCAA TATCTGTCCC 420 

TGAAAATGGC AAGGGTCCCT TCCCCCAGAG ACTGAATCAG CTCAAGTCTA ATAAAGATAG 480 

AGACACCAAG ATTTTCTACA GCATCACGGG GCCGGGGGCA GACAGCCCCC CTGAGGGTGT 540 

CTTCGCTGTA GAGAAGGAGA CAGGCTGGTT GTTGTTGAAT AAGCCACTGG ACCGGGAGGA 600 

GATTGCCAAG TATGAGCTCT TTGGCCACGC TGTGTCAGAG AATGGTGCCT CAGTGGAGGA 660 

CCCCATGAAC ATCTCCATCA TCGTGACCGA CCAGAATGAC CACAAGCCCA AGTTTACCCA 720 

GGACACCTTC CGAGGGAGTG TCTTAGAGGG AGTCCTACCA GGTACTTCTG TGATGCAGGT 780 

GACAGCCACG GATGAGGATG ATGCCATCTA CACCTACAAT GGGGTGGTTG CTTACTCCAT 840 

CCATAGCCAA GAACCAAAGG ACCCACACGA CCTCATGTTC ACCATTCACC GGAGCACAGG 900 

CACCATCAGC GTCATCTCCA GTGGCCTGGA CCGGGAAAAA GTCCCTGAGT ACACACTGAC 960 

CATCCAGGCC ACAGACATGG ATGGGGACGG CTCCACCACC ACGGCAGTGG CAGTAGTGGA 1020 

GATCCTTGAT GCCAATGACA ATGCTCCCAT GTTTGACCCC CAGAAGTACG AGGCCCATGT 1080 

GCCTGAGAAT GCAGTGGGCC ATGAGGTGCA GAGGCTGACG GTCACTGATC TGGACGCCCC 1140 

CAACTCACCA GCGTGGCGTG CCACCTACCT TATCATGGGC GGTGACGACG GGGACCATTT 1200 

TACCATCACC ACCCACCCTG AGAGCAACCA GGGCATCCTG ACAACCAGGA AGGGTTTGGA 1260 

TTTTGAGGCC AAAAACCAGC ACACCCTGTA CGTTGAAGTG ACCAACGAGG CCCCTTTTGT 1320 

GCTGAAGCTC CCAACCTCCA CAGCCACCAT AGTGGTCCAC GTGGAGGATG TGAATGAGGC 1380 

ACCTGTGTTT GTCCCACCCT CCAAAGTCGT TGAGGTCCAG GAGGGCATCC CCACTGGGGA 1440 

GCCTGTGTGT GTCTACACTG CAGAAGACCC TGACAAGGAG AATCAAAAGA TCAGCTACCG 1500 

CATCCTGAGA GACCCAGCAG GGTGGCTAGC CATGGACCCA GACAGTGGGC AGGTCACAGC 1560 

TGTGGGCACC CTCGACCGTG AGGATGAGCA GTTTGTGAGG AACAACATCT ATGAAGTCAT 1620 

GGTCTTGGCC ATGGACAATG GAAGCCCTCC CACCACTGGC ACGGGAACCC TTCTGCTAAC 1680

ACTGATTGAT GTCAATGACC ATGGCCCAGT CCCTGAGCCC CGTCAGATCA CCATCTGCAA 1740 

CCAAAGCCCT GTGCGCCAGG TGCTGAACAT CACGGACAAG GACCTGTCTC CCCACACCTC 1800 

CCCTTTCCAG GCCCAGCTCA CAGATGACTC AGACATCTAC TGGACGGCAG AGGTCAACGA 1860 

GGAAGGTGAC ACAGTGGTCT TGTCCCTGAA GAAGTTCCTG AAGCAGGATA CATATGACGT 1920 

GCACCTTTCT CTGTCTGACC ATGGCAACAA AGAGCAGCTG ACGGTGATCA GGGCCACTGT 1980 

GTGCGACTGC CATGGCCATG TCGAAACCTG CCCTGGACCC TGGAAGGGAG GTTTCATCCT 2040 

CCCTGTGCTG GGGGCTGTCC TGGCTCTGCT GTTCCTCCTG CTGGTGCTGC TTTTGTTGGT 2100 

GAGAAAGAAG CGGAAGATCA AGGAGCCCCT CCTACTCCCA GAAGATGACA CCCGTGACAA 2160 

CGTCTTCTAC TATGGCGAAG AGGGGGGTGG CGAAGAGGAC CAGGACTATG ACATCACCCA 2220 

GCTCCACCGA GGTCTGGAGG CCAGGCCGGA GGTGGTTCTC CGCAATGACG TGGCACCAAC 2280 

CATCATCCCG ACACCCATGT ACCGTCCTCG GCCAGCCAAC CCAGATGAAA TCGGCAACTT 2340 

TATAATTGAG AACCTGAAGG CGGCTAACAC AGACCCCACA GCCCCGCCCT ACGACACCCT 2400 

CTTGGTGTTC GACTATGAGG GCAGCGGCTC CGACGCCGCG TCCCTGAGCT CCCTCACCTC 2460 

CTCCGCCTCC GACCAAGACC AAGATTACGA TTATCTGAAC GAGTGGGGCA GCCGCTTCAA 2520 

GAAGCTGGCA GACATGTACG GTGGCGGGGA GGACGACTAG GCGGCCTGCC TGCAGGGCTG 2580 

GGGACCAAAC GTCAGGCCAC AGAGCATCTC CAAGGGGTCT CAGTTCCCCC TTCAGCTGAG 2640 

GACTTCGGAG CTTGTCAGGA AGTGGCCGTA GCAACTTGGC GGAGACAGGC TATGAGTCTG 2700 

ACGTTAGAGT GGTTGCTTCC TTAGCCTTTC AGGATGGAGG AATGTGGGCA GTTTGACTTC 2760 

AGCACTGAAA ACCTCTCCAC CTGGGCCAGG GTTGCCTCAG AGGCCAAGTT TCCAGAAGCC 2820 

TCTTACCTGC CGTAAAATGC TCAACCCTGT GTCCTGGGCC TGGGCCTGCT GTGACTGACC 2880 

TACAGTGGAC TTTCTCTCTG GAATGGAACC TTCTTAGGCC TCCTGGTGCA ACTTAATTTT 2940 

TTTTTTTAAT GCTATCTTCA AAAOGTTAGA GAAAGTTCTT CAAAAGTGCA GCCCAGAGCT 3000 

GCTGGGCCCA CTGGCCGTCC TGCATTTCTG GTTTCCAGAC CCCAATGCCT CCCATTCGGA 3060

TGGATCTCTG CGTTTTTATA CTGAGTGTGC CTAGGTTGCC CCTTATTTTT TATTTTCCCT 3120 

GTTGCGTTGC TATAGATGAA GGGTGAGGAC AATCGTGTAT ATGTACTAGA ACTTTTTTAT 3180 
TAAAGAAACT TTTCCCAGAA AAAAA 

Seq ID NO: 59 Protein sequence:
Protein Accession #: NP_001784.2

1 11 21 31 41 51

| | | | | |

MGLPRGPLAS LLLLQVCWLQ CAASEPCRAV FREAEVTLEA GGAEQEPGQA LGKVFMGCPG 60

QEPALFSTDN DDFTVRNGET VQERRSLKER NPLKIFPSKR ILRRHKRDWV VAPISVPENG 120

KGPFPQRLNQ LKSNKDRDTK IFYSITGPGA DSPPEGVFAV EKETGWLLLN KPLDREEIAK 180

YELFGHAVSE NGASVEDPMN ISIIVTDQND HKPKFTQDTF RGSVLEGVLP GTSVMQVTAT 240

DEDDAIYTYN GVVAYSIHSQ EPKDPHDLMF TIHRSTGTIS VISSGLDREK VPEYTLTIQA 300

TDMDGDGSTT TAVAVVEILD ANDNAPMFDP QKYEAHVPEN AVGHEVQRLT VTDLDAPNSP 360

AWRATYLIMG GDDGDHFTIT THPESNQGIL TTRKGLDFEA KNQHTLYVEV TNEAPFVLKL 420

PTSTATIVVH VEDVNEAPVF VPPSKVVEVQ EGIPTGEPVC VYTAEDPDKE NQKISYRILR 480

DPAGWLAMDP DSGQVTAVGT LDREDEQFVR NNIYEVMVLA MDNGSPPTTG TGTLLLTLID 540

VNDHGPVPEP RQITICNQSP VRQVLNITDK DLSPHTSPFQ AQLTDDSDIY WTAEVNEEGD 600

TVVLSLKKFL KQDTYDVHLS LSDHGNKEQL TVIRATVCDC HGHVETCPGP WKGGFILPVL 660

GAVLALLFLL LVLLLLVRKK RKIKEPLLLP EDDTRDNVFY YGEEGGGEED QDYDITQLHR 720

GLEARPEVVL RNDVAPTIIP TPMYRPRPAN PDEIGNFIIE NLKAANTDPT APPYDTLLVF 780

DYEGSGSDAA SLSSLTSSAS DQDQDYDYLN EWGSRFKKLA DMYGGGEDD

Seq ID NO: 60 DNA sequence

Nucleic Acid Accession #: Eos sequence 
Coding sequence: 162-428 

1 11 21 31 41 51 

| | | | | |

GCGTTCCGTT GGCGGCGGAT TCGAACGTTC GGACTGAGGT TTTTCTGCCT GAAGAAGCGT 60 
CATACGGACC GGATTGTTTT CGCTGGCCCA GTGTCCCCGG AGCTTGTGTG CGATACAGAG 120 
AGCACCTCGG AAGCTGAGGC AGCTGGTACT TGACAGAGAG GATGGCGCTG TCGACCATAG 180 
TCTCCCAGAG GAAGCAGATA AAGCGGAAGG CTCCCCGTGG CTTTCTAAAG CGAGTCTTCA 240 
AGCGAAAGAA GCCTCAACTT CGTCTGGAGA AAAGTGGTGA CTTATTGGTC CATCTGAACT 300
GTTTACTGTT TGTTCATCGA TTAGCAGAAG AGTCCAGGAC AAACGCTTGT GCGAGTAAAT 360 
GTAGAGTCAT TAACAAGGAG CATGTACTGG CCGCAGCAAA GGTAATTCTA AAGAAGAGCA 420 
GAGGTTAGAA GTCAAAGAAC ATATTCTTGA AAGTTATGAT GCATTCTTTT GGGTGGTAAC 480 
AGATCATAAA GACATTTTTT ACACATCAGT TAATATGGGA TTATTAAATA TTGG 






Seq ID NO: 61 Protein sequence: 
Protein Accession #: Eos sequence 



1 11 21 31 41 51 

| | | | | |

MALSTIVSQR KQIKRKAPRG FLKRVFKRKK PQLRLEKSGD LLVHLNCLLF VHRLAEESRT 60 
NACASKCRVI NKEHVLAAAK VILKKSRG 



Seq ID NO: 62 DNA sequence 
Nucleic Acid Accession #: NM_000094.2
Coding sequence: 99-8933 

1 11 21 31 41 51 

| | | | | |

GGGCTGGAGG GGCGCTGGGC TCGGACCTGC CAAGGCCACC GCAGGGGGGA GCAAGGGACA 60

GAGGCGGGGG TCCTAGCTGA CGGCTTTTAC TGCCTAGGAT GACGCTGCGG CTTCTGGTGG 120 

CCGCGCTCTG CGCCGGGATC CTGGCAGAGG CGCCCCGAGT GCGAGCCCAG CACAGGGAGA 180 

GAGTGACCTG CACGCGCCTT TACGCCGCTG ACATTGTGTT CTTACTGGAT GGCTCCTCAT 240 

CCATTGGCCG CAGCAATTTC CGCGAGGTCC GCAGCTTTCT CGAAGGGCTG GTGCTGCCTT 300 

TCTCTGGAGC AGCCAGTGCA CAGGGTGTGC GCTTTGCCAC AGTGCAGTAC AGCGATGACC 360

CACGGACAGA GTTCGGCCTG GATGCACTTG GCTCTGGGGG TGATGTGATC CGCGCCATCC 420 

GTGAGCTTAG CTACAAGGGG GGCAACACTC GCACAGGGGC TGCAATTCTC CATGTGGCTG 480 

ACCATGTCTT CCTGCCCCAG CTGGCCCGAC CTGGTGTCCC CAAGGTCTGC ATCCTGATCA 540 

CAGACGGGAA GTCCCAGGAC CTGGTGGACA CAGCTGCCCA AAGGCTGAAG GGGCAGGGGG 600

TCAAGCTATT TGCTGTGGGG ATCAAGAATG CTGACCCTGA GGAGCTGAAG CGAGTTGCCT 660

CACAGCCAAC CTCCGACTTC TTCTTCTTCG TCAATGACTT CAGCATCTTG AGGACACTAC 720 

TGCCCCTCGT TTCCCGGAGA GTGTGCACGA CTGCTGGTGG CGTGCCTGTG ACCCGACCTC 780 

CGGATGACTC GACCTCTGCT CCACGAGACC TGGTGCTGTC TGAGCCAAGC AGCCAATCCT 840 

TGAGAGTACA GTGGACAGCG GCCAGTGGCC CTGTGACTGG CTACAAGGTC CAGTACACTC 900 

CTCTGACGGG GCTGGGACAG CCACTGCCGA GTGAGCGGCA GGAGGTGAAC GTCCCAGCTG 960

GTGAGACCAG TGTGCGGCTG CGGGGTCTCC GGCCACTGAC CGAGTACCAA GTGACTGTGA 1020

TTGCCCTCTA CGCCAACAGC ATCGGGGAGG CTGTGAGCGG GACAGCTCGG ACCACTGCCC 1080

TAGAAGGGCC GGAACTGACC ATCCAGAATA CCACAGCCCA CAGCCTCCTG GTGGCCTGGC 1140

GGAGTGTGCC AGGTGCCACT GGCTACCGTG TGACATGGCG GGTCCTCAGT GGTGGGCCCA 1200

CACAGCAGCA GGAGCTGGGC CCTGGGCAGG GTTCAGTGTT GCTGCGTGAC TTGGAGCCTG 1260

GCACGGACTA TGAGGTGACC GTGAGCACCC TATTTGGCCG CAGTGTGGGG CCCGCCACTT 1320 

CCCTGATGGC TCGCACTGAC GCTTCTGTTG AGCAGACCCT GCGCCCGGTC ATCCTGGGCC 1380 

CCACATCCAT CCTCCTTTCC TGGAACTTGG TGCCTGAGGC CCGTGGCTAC CGGTTGGAAT 1440 

GGCGGCGTGA GACTGGCTTG GAGCCACCGC AGAAGGTGGT ACTGCCCTCT GATGTGACCC 1500

GCTACCAGTT GGATGGGCTG CAGCCGGGCA CTGAGTACCG CCTCACACTC TACACTCTGC 1560

TGGAGGGCCA CGAGGTGGCC ACCCCTGCAA CCGTGGTTCC CACTGGACCA GAGCTGCCTG 1620 

TGAGCCCTGT AACAGACCTG CAAGCCACCG AGCTGCCCGG GCAGCGGGTG CGAGTGTCCT 1680 

GGAGCCCAGT CCCTGGTGCC ACCCAGTACC GCATCATTGT GCGCAGCACC CAGGGGGTTG 1740 

AGCGGACCCT GGTGCTTCCT GGGAGTCAGA CAGCATTCGA CTTGGATGAC GTTCAGGCTG 1800 

GGCTTAGCTA CACTGTGCGG GTGTCTGCTC GAGTGGGTCC CCGTGAGGGC AGTGCCAGTG 1860

TCCTCACTGT CCGCCGGGAG CCGGAAACTC CACTTGCTGT TCCAGGGCTG CGGGTTGTGG 1920 

TGTCAGATGC AACGCGAGTG AGGGTGGCCT GGGGACCCGT CCCTGGAGCC AGTGGATTTC 1980 

GGATTAGCTG GAGCACAGGC AGTGGTCCGG AGTCCAGCCA GACACTGCCC CCAGACTCTA 2040 

CTGCCACAGA CATCACAGGG CTGCAGCCTG GAACCACCTA CCAGGTGGCT GTGTCGGTAC 2100 

TGCGAGGCAG AGAGGAGGGC CCTGCTGCAG TCATCGTGGC TCGAACGGAC CCACTGGGCC 2160

CAGTGAGGAC GGTCCATGTG ACTCAGGCCA GCAGCTCATC TGTCACCATT ACCTGGACCA 2220

GGGTTCCTGG CGCCACAGGA TACAGGGTTT CCTGGCACTC AGCCCACGGC CCAGAGAAAT 2280 




CCCAGTTGGT TTCTGGGGAG GCCACGGTGG CTGAGCTGGA TGGACTGGAG CCAGATACTG 2340

AGTATACGGT GCATGTGAGG GCCCATGTGG CTGGCGTGGA TGGGCCCCCT GCCTCTGTGG 2400 

TTGTGAGGAC TGCCCCTGAG CCTGTGGGTC GTGTGTCGAG GCTGCAGATC CTCAATGCTT 2460 

CCAGCGACGT TCTACGGATC ACCTGGGTAG GGGTCACTGG AGCCACAGCT TACAGACTGG 2520 

CCTGGGGCCG GAGTGAAGGC GGCCCCATGA GGCACCAGAT ACTCCCAGGA AACACAGACT 2580

CTGCAGAGAT CCGGGGTCTC GAAGGTGGAG TCAGCTACTC AGTGCGAGTG ACTGCACTTG 2640 

TCGGGGACCG CGAGGGCACA CCTGTCTCCA TTGTTGTCAC TACGCCGCCT GAGGCTCCGC 2700 

CAGCCCTGGG GACGCTTCAC GTGGTGCAGC GCGGGGAGCA CTCGCTGAGG CTGCGCTGGG 2760 

AGCCGGTGCC CAGAGCGCAG GGCTTCCTTC TGCACTGGCA ACCTGAGGGT GGCCAGGAAC 2820 

AGTCCCGGGT CCTGGGGCCC GAGCTCAGCA GCTATCACCT GGACGGGCTG GAGCCAGCGA 2880 

CACAGTACCG CGTGAGGCTG AGTGTCCTAG GGCCGGCTGG AGAAGGGCCC TCTGCAGAGG 2940 

TGACTGCGCG CACTGAGTCA CCTCGTGTTC CAAGCATTGA ACTACGTGTG GTGGACACCT 3000 

CGATCGACTC GGTGACTTTG GCCTGGACTC CAGTGTCCAG GGCATCCAGC TACATCCTAT 3060 

CCTGGCGGCC ACTCAGAGGC CCTGGCCAGG AAGTGCCTGG GTCCCCGCAG ACACTTCCAG 3120 

GGATCTCAAG CTCCCAGCGG GTGACAGGGC TAGAGCCTGG CGTCTCTTAC ATCTTCTCCC 3180 

TGACGCCTGT CCTGGATGGT GTGCGGGGTC CTGAGGCATC TGTCACACAG ACGCCAGTGT 3240 

GCCCCCGTGG CCTGGCGGAT GTGGTGTTCC TACCACATGC CACTCAAGAC AATGCTCACC 3300 

GTGCGGAGGC TACGAGGAGG GTCCTGGAGC GTCTGGTGTT GGCACTTGGG CCTCTTGGGC 3360 

CACAGGCAGT TCAGGTTGGC CTGCTGTCTT ACAGTCATCG GCCCTCCCCA CTGTTCCCAC 3420 

TGAATGGCTC CCATGACCTT GGCATTATCT TGCAAAGGAT CCGTGACATG CCCTACATGG 3480 

ACCCAAGTGG GAACAACCTG GGCACAGCCG TGGTCACAGC TCACAGATAC ATGTTGGCAC 3540 

CAGATGCTCC TGGGCGCCGC CAGCACGTAC CAGGGGTGAT GGTTCTGCTA GTGGATGAAC 3600 

CCTTGAGAGG TGACATATTC AGCCCCATCC GTGAGGCCCA GGCTTCTGGG CTTAATGTGG 3660 

TGATGTTGGG AATGGCTGGA GCGGACCCAG AGCAGCTGCG TCGCTTGGCG CCGGGTATGG 3720 

ACTCTGTCCA GACCTTCTTC GCCGTGGATG ATGGGCCAAG CCTGGACCAG GCAGTCAGTG 3780 

GTCTGGCCAC AGCCCTGTGT CAGGCATCCT TCACTACTCA GCCCCGGCCA GAGCCCTGCC 3840 

CAGTGTATTG TCCAAAGGGC CAGAAGGGGG AACCTGGAGA GATGGGCCTG AGAGGACAAG 3900 

TTGGGCCTCC TGGCGACCCT GGCCTCCCGG GCAGGACCGG TGCTCCCGGC CCCCAGGGGC 3960 

CCCCTGGAAG TGCCACTGCC AAGGGCGAGA GGGGCTTCCC TGGAGCAGAT GGGCGTCCAG 4020

GCAGCCCTGG CCGCGCCGGG AATCCTGGGA CCCCTGGAGC CCCTGGCCTA AAGGGCTCTC 4080 

CAGGGTTGCC TGGCCCTCGT GGGGACCCGG GAGAGCGAGG ACCTCGAGGC CCAAAGGGGG 4140

AGCCGGGGGC TCCCGGACAA GTCATCGGAG GTGAAGGACC TGGGCTTCCT GGGCGGAAAG 4200 

GGGACCCTGG ACCATCGGGC CCCCCTGGAC CTCGTGGACC ACTGGGGGAC CCAGGACCCC 4260

GTGGCCCCCC AGGGCTTCCT GGAACAGCCA TGAAGGGTGA CAAAGGCGAT CGTGGGGAGC 4320 

GGGGTCCCCC TGGACCAGGT GAAGGTGGCA TTGCTCCTGG GGAGCCTGGG CTGCCGGGTC 4380 

TTCCCGGAAG CCCTGGACCC CAAGGCCCCG TTGGCCCCCC TGGAAAGAAA GGAGAAAAAG 4440 

GTGACTCTGA GGATGGAGCT CCAGGCCTCC CAGGACAACC TGGGTCTCCG GGTGAGCAGG 4500 

GCCCACGGGG ACCTCCTGGA GCTATTGGCC CCAAAGGTGA CCGGGGCTTT CCAGGGCCCC 4560 

TGGGTGAGGC TGGAGAGAAG GGCGAACGTG GACCCCCAGG CCCAGCGGGA TCCCGGGGGC 4620 

TGCCAGGGGT TGCTGGACGT CCTGGAGCCA AGGGTCCTGA AGGGCCACCA GGACCCACTG 4680 

GCCGCCAAGG AGAGAAGGGG GAGCCTGGTC GCCCTGGGGA CCCTGCAGTG GTGGGACCTG 4740 

CTGTTGCTGG ACCCAAAGGA GAAAAGGGAG ATGTGGGGCC CGCTGGGCCC AGAGGAGCTA 4800 

CCGGAGTCCA AGGGGAACGG GGCCCACCCG GCTTGGTTCT TCCTGGAGAC CCTGGCCCCA 4860 

AGGGAGACCC TGGAGACCGG GGTCCCATTG GCCTTACTGG CAGAGCAGGA CCCCCAGGTG 4920 

ACTCAGGGCC TCCTGGAGAG AAGGGAGACC CTGGGCGGCC TGGCCCCCCA GGACCTGTTG 4980 

GCCCCCGAGG ACGAGATGGT GAAGTTGGAG AGAAAGGTGA CGAGGGTCCT CCGGGTGACC 5040 

CGGGTTTGCC TGGAAAAGCA GGCGAGCGTG GCCTTCGGGG GGCACCTGGA GTTCGGGGGC 5100 

CTGTGGGTGA AAAGGGAGAC CAGGGAGATC CTGGAGAGGA TGGACGAAAT GGCAGCCCTG 5160 

GATCATCTGG ACCCAAGGGT GACCGTGGGG AGCCGGGTCC CCCAGGACCC CCGGGACGGC 5220 

TGGTAGACAC AGGACCTGGA GCCAGAGAGA AGGGAGAGCC TGGGGACCGC GGACAAGAGG 5280 

GTCCTCGAGG GCCCAAGGGT GATCCTGGCC TCCCTGGAGC CCCTGGGGAA AGGGGCATTG 5340 

AAGGGTTTCG GGGACCCCCA GGCCCACAGG GGGACCCAGG TGTCCGAGGC CCAGCAGGAG 5400 

AAAAGGGTGA CCGGGGTCCC CCTGGGCTGG ATGGCCGGAG CGGACTGGAT GGGAAACCAG 5460 

GAGCCGCTGG GCCCTCTGGG CCGAATGGTG CTGCAGGCAA AGCTGGGGAC CCAGGGAGAG 5520 

ACGGGCTTCC AGGCCTCCGT GGAGAACAAG GCCTCCCTGG CCCCTCTGGT CCCCCTGGAT 5580 

TACCGGGAAA GCCAGGCGAG GATGGGAAAC CTGGCCTGAA TGGAAAAAAC GGAGAACCTG 5640 

GGGACCCTGG AGAAGACGGG AGGAAGGGAG AGAAAGGAGA TTCAGGCGCC TCTGGGAGAG 5700 

AAGGTCGTGA TGGCCCCAAG GGTGAGCGTG GAGCTCCTGG TATCCTTGGA CCCCAGGGGC 5760 

CTCCAGGCCT CCCAGGGCCA GTGGGCCCTC CTGGCCAGGG TTTTCCTGGT GTCCCAGGAG 5820 

GCACGGGCCC CAAGGGTGAC CGTGGGGAGA CTGGATCCAA AGGGGAGCAG GGCCTCCCTG 5880 

GAGAGCGTGG CCTGCGAGGA GAGCCTGGAA GTGTGCCGAA TGTGGATCGG TTGCTGGAAA 5940 

CTGCTGGCAT CAAGGCATCT GCCCTGCGGG AGATCGTGGA GACCTGGGAT GAGAGCTCTG 6000 

GTAGCTTCCT GCCTGTGCCC GAACGGCGTC GAGGCCCCAA GGGGGACTCA GGCGAACAGG 6060 

GCCCCCCAGG CAAGGAGGGC CCCATCGGCT TTCCTGGAGA ACGCGGGCTG AAGGGCGACC 6120 

GTGGAGACCC TGGCCCTCAG GGGCCACCTG GTCTGGCCCT TGGGGAGAGG GGCCCCCCCG 6180

GGCCTTCCGG CCTTGCCGGG GAGCCTGGAA AGCCTGGTAT TCCCGGGCTC CCAGGCAGGG 6240 

CTGGGGGTGT GGGAGAGGCA GGAAGGCCAG GAGAGAGGGG AGAACGGGGA GAGAAAGGAG 6300 

AACGTGGAGA ACAGGGCAGA GATGGCCCTC CTGGACTCCC TGGAACCCCT GGGCCCCCCG 6360 

GACCCCCTGG CCCCAAGGTG TCTGTGGATG AGCCAGGTCC TGGACTCTCT GGAGAACAGG 6420 

GACCCCCTGG ACTCAAGGGT GCTAAGGGGG AGCCGGGCAG CAATGGTGAC CAAGGTCCCA 6480 

AAGGAGACAG GGGTGTGCCA GGCATCAAAG GAGACCGGGG AGAGCCTGGA CCGAGGGGTC 6540 

AGGACGGCAA CCCGGGTCTA CCAGGAGAGC GTGGTATGGC TGGGCCTGAA GGGAAGCCGG 6600 

GTCTGCAGGG TCCAAGAGGC CCCCCTGGCC CAGTGGGTGG TCATGGAGAC CCTGGACCAC 6660 

CTGGTGCCCC GGGTCTTGCT GGCCCTGCAG GACCCCAAGG ACCTTCTGGC CTGAAGGGGG 6720 

AGCCTGGAGA GACAGGACCT CCAGGACGGG GCCTGACTGG ACCTACTGGA GCTGTGGGAC 6780 

TTCCTGGACC CCCCGGCCCT TCAGGCCTTG TGGGTCCACA GGGGTCTCCA GGTTTGCCTG 6840 

GACAAGTGGG GGAGACAGGG AAGCCGGGAG CCCCAGGTCG AGATGGTGCC AGTGGAAAAG 6900 

ATGGAGACAG AGGGAGCCCT GGTGTGCCAG GGTCACCAGG TCTGCCTGGC CCTGTCGGAC 6960 

CTAAAGGAGA ACCTGGCCCC ACGGGGGCCC CTGGACAGGC TGTGGTCGGG CTCCCTGGAG 7020 

CAAAGGGAGA GAAGGGAGCC CCTGGAGGCC TTGCTGGAGA CCTGGTGGGT GAGCCGGGAG 7080 

CCAAAGGTGA CCGAGGACTG CCAGGGCCGC GAGGCGAGAA GGGTGAAGCT GGCCGTGCAG 7140 

GGGAGCCCGG AGACCCTGGG GAAGATGGTC AGAAAGGGGC TCCAGGACCC AAAGGTTTCA 7200 

AGGGTGACCC AGGAGTCGGG GTCCCGGGCT CCCCTGGGCC TCCTGGCCCT CCAGGTGTGA 7260 

AGGGAGATCT GGGCCTCCCT GGCCTGCCCG GTGCTCCTGG TGTTGTTGGG TTCCCGGGTC 7320 

AGACAGGCCC TCGAGGAGAG ATGGGTCAGC CAGGCCCTAG TGGAGAGCGG GGTCTGGCAG 7380 

GCCCCCCAGG GAGAGAAGGA ATCCCAGGAC CCCTGGGGCC ACCTGGACCA CCGGGGTCAG 7440 

TGGGACCACC TGGGGCCTCT GGACTCAAAG GAGACAAGGG AGACCCTGGA GTAGGGCTGC 7500 
CTGGGCCCCG AGGCGAGCGT GGGGAGCCAG GCATCCGGGG TGAAGATGGC CGCCCCGGCC 7560

AGGAGGGACC CCGAGGACTC ACGGGGCCCC CTGGCAGCAG GGGAGAGCGT GGGGAGAAGG 7620

GTGATGTTGG GAGTGCAGGA CTAAAGGGTG ACAAGGGAGA CTCAGCTGTG ATCCTGGGGC 7680 

CTCCAGGCCC ACGGGGTGCC AAGGGGGACA TGGGTGAACG AGGGCCTCGG GGCTTGGATG 7740 

GTGACAAAGG ACCTCGGGGA GACAATGGGG ACCCTGGTGA CAAGGGCAGC AAGGGAGAGC 7800 

CTGGTGACAA GGGCTCAGCC GGGTTGCCAG GACTGCGTGG ACTCCTGGGA CCCCAGGGTC 7860 

AACCTGGTGC AGCAGGGATC CCTGGTGACC CGGGATCCCC AGGAAAGGAT GGAGTGCCTG 7920 

GTATCCGAGG AGAAAAAGGA GATGTTGGCT TCATGGGTCC CCGGGGCCTC AAGGGTGAAC 7980 

GGGGAGTGAA GGGAGCCTGT GGCCTTGATG GAGAGAAGGG AGACAAGGGA GAAGCTGGTC 8040 

CCCCAGGCCG CCCCGGGCTG GCAGGACACA AAGGAGAGAT GGGGGAGCCT GGTGTGCCGG 8100 

GCCAGTCGGG GGCCCCTGGC AAGGAGGGCC TGATCGGTCC CAAGGGTGAC CGAGGCTTTG 8160

ACGGGCAGCC AGGCCCCAAG GGTGACCAGG GCGAGAAAGG GGAGCGGGGA ACCCCAGGAA 8220

TTGGGGGCTT CCCAGGCCCC AGTGGAAATG ATGGCTCTGC TGGTCCCCCA GGGCCACCTG 8280 

GCAGTGTTGG TCCCAGAGGC CCCGAAGGAC TTCAGGGCCA GAAGGGTGAG CGAGGTCCCC 8340 

CCGGAGAGAG AGTGGTGGGG GCTCCTGGGG TCCCTGGAGC TCCTGGCGAG AGAGGGGAGC 8400 

AGGGGCGGCC AGGGCCTGCC GGTCCTCGAG GCGAGAAGGG AGAAGCTGCA CTGACGGAGG 8460

ATGACATCCG GGGCTTTGTG CGCCAAGAGA TGAGTCAGCA CTGTGCCTGC CAGGGCCAGT 8520 

TCATCGCATC TGGATCACGA CCCCTCCCTA GTTATGCTGC AGACACTGCC GGCTCCCAGC 8580 

TCCATGCTGT GCCTGTGCTC CGCGTCTCTC ATGCAGAGGA GGAAGAGCGG GTACCCCCTG 8640 

AGGATGATGA GTACTCTGAA TACTCCGAGT ATTCTGTGGA GGAGTACCAG GACCCTGAAG 8700

CTCCTTGGGA TAGTGATGAC CCCTGTTCCC TGCCACTGGA TGAGGGCTCC TGCACTGCCT 8760 

ACACCCTGCG CTGGTACCAT CGGGCTGTGA CAGGCAGCAC AGAGGCCTGT CACCCTTTTG 8820 

TCTATGGTGG CTGTGGAGGG AATGCCAACC GTTTTGGGAC CCGTGAGGCC TGCGAGCGCC 8880 

GCTGCCCACC CCGGGTGGTC CAGAGCCAGG GGACAGGTAC TGCCCAGGAC TGAGGCCCAG 8940 

ATAATGAGCT GAGATTCAGC ATCCCCTGGA GGAGTCGGGG TCTCAGCAGA ACCCCACTGT 9000

CCCTCCCCTT GGTGCTAGAG GCTTGTGTGC ACGTGAGCGT GCGAGTGCAC GTCCGTTATT 9060 

TCAGTGACTT GGTCCCGTGG GTCTAGCCTT CCCCCCTGTG GACAAACCCC CATTGTGGCT 9120 

CCTGCCACCC TGGCAGATGA CTCACTGTGG GGGGGTGGCT GTGGGCAGTG AGCGGATGTG 9180 

ACTGGCGTCT GACCCGCCCC TTGACCCAAG CCTGTGATGA CATGGTGCTG ATTCTGGGGG 9240 
GCATTAAAGC TGCTGTTTTA AAAGGCAAAA AA 

Seq ID NO: 63 Protein sequence: 
Protein Accession #: NP_000085.1 

1 11 21 31 41 51 

I I I I I I 

MTLRLLVAAL CAGILAEAPR VRAQHRERVT CTRLYAADIV FLLDGSSSIG RSNFREVRSF 60

LEGLVLPFSG AASAQGVRFA TVQYSDDPRT EFGLDALGSG GDVIRAIREL SYKGGNTRTG 120 

AAILHVADHV FLPQLARPGV PKVCILITDG KSQDLVDTAA QRLKGQGVKL FAVGIKNADP 180 

EELKRVASQP TSDFFFFVND FSILRTLLPL VSRRVCTTAG GVPVTRPPDD STSAPRDLVL 240 

SEPSSQSLRV QWTAASGPVT GYKVQYTPLT GLGQPLPSER QEVNVPAGET SVRLRGLRPL 300

TEYQVTVIAL YANSIGEAVS GTARTTALEG PELTIQNTTA HSLLVAWRSV PGATGYRVTW 360 

RVLSGGPTQQ QELGPGQGSV LLRDLEPGTD YEVTVSTLFG RSVGPATSLM ARTDASVEQT 420

LRPVILGPTS ILLSWNLVPE ARGYRLEWRR ETGLEPPQKV VLPSDVTRYQ LDGLQPGTEY 480

RLTLYTLLEG HEVATPATVV PTGPELPVSP VTDLQATELP GQRVRVSWSP VPGATQYRII 540

VRSTQGVERT LVLPGSQTAF DLDDVQAGLS YTVRVSARVG PREGSASVLT VRREPETPLA 600 

VPGLRVVVSD ATRVRVAWGP VPGASGFRIS WSTGSGPESS QTLPPDSTAT DITGLQPGTT 660

YQVAVSVLRG REEGPAAVIV ARTDPLGPVR TVHVTQASSS SVTITWTRVP GATGYRVSWH 720 

SAHGPEKSQL VSGEATVAEL DGLEPDTEYT VHVRAHVAGV DGPPASVVVR TAPEPVGRVS 780

RLQILNASSD VLRITWVGVT GATAYRLAWG RSEGGPMRHQ ILPGNTDSAE IRGLEGGVSY 840 

SVRVTALVGD REGTPVSIVV TTPPEAPPAL GTLHVVQRGE HSLRLRWEPV PRAQGFLLHW 900

QPEGGQEQSR VLGPELSSYH LDGLEPATQY RVRLSVLGPA GEGPSAEVTA RTESPRVPSI 960 

ELRVVDTSID SVTLAWTPVS RASSYILSWR PLRGPGQEVP GSPQTLPGIS SSQRVTGLEP 1020

GVSYIFSLTP VLDGVRGPEA SVTQTPVCPR GLADVVFLPH ATQDNAHRAE ATRRVLERLV 1080

LALGPLGPQA VQVGLLSYSH RPSPLFPLNG SHDLGIILQR IRDMPYMDPS GNNLGTAVVT 1140

AHRYMLAPDA PGRRQHVPGV MVLLVDEPLR GDIFSPIREA QASGLNVVML GMAGADPEQL 1200

RRLAPGMDSV QTFFAVDDGP SLDQAVSGLA TALCQASFTT QPRPEPCPVY CPKGQKGEPG 1260 

EMGLRGQVGP PGDPGLPGRT GAPGPQGPPG SATAKGERGF PGADGRPGSP GRAGNPGTPG 1320 

APGLKGSPGL PGPRGDPGER GPRGPKGEPG APGQVIGGEG PGLPGRKGDP GPSGPPGPRG 1380 

PLGDPGPRGP PGLPGTAMKG DKGDRGERGP PGPGEGGIAP GEPGLPGLPG SPGPQGPVGP 1440

PGKKGEKGDS EDGAPGLPGQ PGSPGEQGPR GPPGAIGPKG DRGFPGPLGE AGEKGERGPP 1500 

GPAGSRGLPG VAGRPGAKGP EGPPGPTGRQ GEKGEPGRPG DPAVVGPAVA GPKGEKGDVG 1560

PAGPRGATGV QGERGPPGLV LPGDPGPKGD PGDRGPIGLT GRAGPPGDSG PPGEKGDPGR 1620 

PGPPGPVGPR GRDGEVGEKG DEGPPGDPGL PGKAGERGLR GAPGVRGPVG EKGDQGDPGE 1680 

DGRNGSPGSS GPKGDRGEPG PPGPPGRLVD TGPGAREKGE PGDRGQEGPR GPKGDPGLPG 1740 

APGERGIEGF RGPPGPQGDP GVRGPAGEKG DRGPPGLDGR SGLDGKPGAA GPSGPNGAAG 1800 

KAGDPGRDGL PGLRGEQGLP GPSGPPGLPG KPGEDGKPGL NGKNGEPGDP GEDGRKGEKG 1860 

DSGASGREGR DGPKGERGAP GILGPQGPPG LPGPVGPPGQ GFPGVPGGTG PKGDRGETGS 1920 

KGEQGLPGER GLRGEPGSVP NVDRLLETAG IKASALREIV ETWDESSGSF LPVPERRRGP 1980 

KGDSGEQGPP GKEGPIGFPG ERGLKGDRGD PGPQGPPGLA LGERGPPGPS GLAGEPGKPG 2040 

IPGLPGRAGG VGEAGRPGER GERGEKGERG EQGRDGPPGL PGTPGPPGPP GPKVSVDEPG 2100

PGLSGEQGPP GLKGAKGEPG SNGDQGPKGD RGVPGIKGDR GEPGPRGQDG NPGLPGERGM 2160 

AGPEGKPGLQ GPRGPPGPVG GHGDPGPPGA PGLAGPAGPQ GPSGLKGEPG ETGPPGRGLT 2220 

GPTGAVGLPG PPGPSGLVGP QGSPGLPGQV GETGKPGAPG RDGASGKDGD RGSPGVPGSP 2280

GLPGPVGPKG EPGPTGAPGQ AVVGLPGAKG EKGAPGGLAG DLVGEPGAKG DRGLPGPRGE 2340

KGEAGRAGEP GDPGEDGQKG APGPKGFKGD PGVGVPGSPG PPGPPGVKGD LGLPGLPGAP 2400

GVVGFPGQTG PRGEMGQPGP SGERGLAGPP GREGIPGPLG PPGPPGSVGP PGASGLKGDK 2460

GDPGVGLPGP RGERGEPGIR GEDGRPGQEG PRGLTGPPGS RGERGEKGDV GSAGLKGDKG 2520

DSAVILGPPG PRGAKGDMGE RGPRGLDGDK GPRGDNGDPG DKGSKGEPGD KGSAGLPGLR 2580 

GLLGPQGQPG AAGIPGDPGS PGKDGVPGIR GEKGDVGFMG PRGLKGERGV KGACGLDGEK 2640 

GDKGEAGPPG RPGLAGHKGE MGEPGVPGQS GAPGKEGLIG PKGDRGFDGQ PGPKGDQGEK 2700 

GERGTPGIGG FPGPSGNDGS AGPPGPPGSV GPRGPEGLQG QKGERGPPGE RVVGAPGVPG 2760

APGERGEQGR PGPAGPRGEK GEAALTEDDI RGFVRQEMSQ HCACQGQFIA SGSRPLPSYA 2820 

ADTAGSQLHA VPVLRVSHAE EEERVPPEDD EYSEYSEYSV EEYQDPEAPW DSDDPCSLPL 2880 

DEGSCTAYTL RWYHRAVTGS TEACHPFVYG GCGGNANRFG TREACERRCP PRVVQSQGTG 2940
TAQD 
Seq ID NO: 64 DNA sequence
Nucleic Acid Accession #: NM_006945
Coding sequence: 1-219 

1 11 21 31 41 51 

I I I I I I 

ATGTCTTATC AACAGCAGCA GTGCAAGCAG CCCTGCCAGC CACCTCCTGT GTGCCCCACG 60 

CCAAAGTGCC CAGAGCCATG TCCACCCCCG AAGTGCCCTG AGCCCTGCCC ACCACCAAAG 120 

TGTCCACAGC CCTGCCCACC TCAGCAGTGC CAGCAGAAAT ATCCTCCTGT GACACCTTCC 180 
CCACCCTGCC AGCCAAAGTA TCCACCGAAG AGCAAGTAA 

Seq ID NO: 65 Protein sequence:
Protein Accession #: NP_008876

1 11 21 31 41 51 

I I I I I I 

MSYQQQQCKQ PCQPPPVCPT PKCPEPCPPP KCPEPCPPPK CPQPCPPQQC QQKYPPVTPS 60 

PPCQPKYPPK SK 

Seq ID NO: 66 DNA sequence 

Nucleic Acid Accession #: NM_005629.1

Coding sequence: 639-2546 

1 11 21 31 41 51 

I I I I I I 

TAGTCGGAGC GAGGTGGCGA GTCGCTGAGC CCGCCGCGGC CCCGAGAGCG GCTGCAGCCG 60 

CCGCCGCCGG GAAGGAGAGG GCGAGGCGCG CCCGAGCCGC CGCCGCCGCC GCCACCGCCG 120 

CCGCCGCCAC CACCGCCACC GGAGTCGCGG GCCAGCCGGG CAGCCTCCGC GGGCCCCGGC 180 

CGGGGCGGGG GGCGCGGGCC ACAGGCCCCT GCTCCGGCCG TCGTTTGCAG ACCGCGGGCG 240 

CCGATGTCGC CCGCGCCCCG TTAGGATGAG TCTCGGGTCG GGCGAGGAGC CGCCGCAGCC 300 

GCCGCCGCCC GAGCCGCGGG CAGGAGCCTC GGGAGCCGCC GCCGCCGCCG CCGCCGCCCG 360 

GCCGGGCCCC GACGCCGCCC GCGCGCCCCC GGGCCCCCGA CACACATGAG ATTCTTCAGG 420 

CTCACTTTCA AGTGCTTCGT GGACTGCTTC TGACTGCGCC GCCCGCGCCC CGCACCCCGC 480 

CGTCCGCCCG CCGCCCCGTC CCCCGGCCCG GCCGCCCCCC GGCCCCCGGC CGGCCCGCGC 540 

CCTCGGGGCC CTCCCCGGTG CCGCCGGTGC CCCCCGCCTG ACCGCCGCCC CCCGTGAGGC 600 

GCCGCGACCC CGGCCCGGCC GTGCGGCCCG CCGGGGCCAT GGCGAAGAAG AGCGCCGAGA 660 

ACGGCATCTA TAGCGTGTCC GGCGACGAGA AGAAGGGCCC CCTCATCGCG CCCGGGCCCG 720 

ACGGGGCCCC GGCCAAGGGC GACGGCCCCG TGGGCCTGGG GACACCCGGC GGCCGCCTGG 780

CCGTGCCGCC GCGCGAGACC TGGACGCGCC AGATGGACTT CATCATGTCG TGCGTGGGCT 840 

TCGCCGTGGG CTTGGGCAAC GTGTGGCGCT TCCCCTACCT GTGCTACAAG AACGGCGGAG 900 

GTGTGTTCCT TATTCCCTAC GTCCTGATCG CCCTGGTTGG AGGAATCCCC ATTTTCTTCT 960 

TAGAGATCTC GCTGGGCCAG TTCATGAAGG CCGGCAGCAT CAATGTCTGG AACATCTGTC 1020 

CCCTGTTCAA AGGCCTGGGC TACGCCTCCA TGGTGATCGT CTTCTACTGC AACACCTACT 1080 

ACATCATGGT GCTGGCCTGG GGCTTCTATT ACCTGGTCAA GTCCTTTACC ACCACGCTGC 1140 

CCTGGGCCAC ATGTGGCCAC ACCTGGAACA CTCCCGACTG CGTGGAGATC TTCCGCCATG 1200

AAGACTGTGC CAATGCCAGC CTGGCCAACC TCACCTGTGA CCAGCTTGCT GACCGCCGGT 1260 

CCCCTGTCAT CGAGTTCTGG GAGAACAAAG TCTTGAGGCT GTCTGGGGGA CTGGAGGTGC 1320 

CAGGGGCCCT CAACTGGGAG GTGACCCTTT GTCTGCTGGC CTGCTGGGTG CTGGTCTACT 1380 

TCTGTGTCTG GAAGGGGGTC AAATCCACGG GAAAGATCGT GTACTTCACT GCTACATTCC 1440 

CCTACGTGGT CCTGGTCGTG CTGCTGGTGC GTGGAGTGCT GCTGCCTGGC GCCCTGGATG 1500 

GCATCATTTA CTATCTCAAG CCTGACTGGT CAAAGCTGGG GTCCCCTCAG GTGTGGATAG 1560 

ATGCGGGGAC CCAGATTTTC TTTTCTTACG CCATTGGCCT GGGGGCCCTC ACAGCCCTGG 1620 

GCAGCTACAA CCGCTTCAAC AACAACTGCT ACAAGGACGC CATCATCCTG GCTCTCATCA 1680 

ACAGTGGGAC CAGCTTCTTT GCTGGCTTCG TGGTCTTCTC CATCCTGGGC TTCATGGCTG 1740 

CAGAGCAGGG CGTGCACATC TCCAAGGTGG CAGAGTCAGG GCCGGGCCTG GCCTTCATCG 1800 

CCTACCCGCG GGCTGTCACG CTGATGCCAG TGGCCCCACT CTGGGCTGCC CTGTTCTTCT 1860 

TCATGCTGTT GCTGCTTGGT CTCGACAGCC AGTTTGTAGG TGTGGAGGGC TTCATCACCG 1920 

GCCTCCTCGA CCTCCTCCCG GCCTCCTACT ACTTCCGTTT CCAAAGGGAG ATCTCTGTGG 1980 

CCCTCTGTTG TGCCCTCTGC TTTGTCATCG ATCTCTCCAT GGTGACTGAT GGCGGGATGT 2040 

ACGTCTTCCA GCTGTTTGAC TACTACTCGG CCAGCGGCAC CACCCTGCTC TGGCAGGCCT 2100 

TTTGGGAGTG CGTGGTGGTG GCCTGGGTGT ACGGAGCTGA CCGCTTCATG GACGACATTG 2160 

CCTGTATGAT CGGGTACCGA CCTTGCCCCT GGATGAAATG GTGCTGGTCC TTCTTCACCC 2220 

CGCTGGTCTG CATGGGCATC TTCATCTTCA ACGTTGTGTA CTACGAGCCG CTGGTCTACA 2280 

ACAACACCTA CGTGTACCCG TGGTGGGGTG AGGCCATGGG CTGGGCCTTC GCCCTGTCCT 2340 

CCATGCTGTG CGTGCCGCTG CACCTCCTGG GCTGCCTCCT CAGGGCCAAG GGCACCATGG 2400 

CTGAGCGCTG GCAGCACCTG ACCCAGCCCA TCTGGGGCCT CCACCACTTG GAGTACCGAG 2460

CTCAGGACGC AGATGTCAGG GGCCTGACCA CCCTGACCCC AGTGTCCGAG AGCAGCAAGG 2520 

TCGTCGTGGT GGAGAGTGTC ATGTGACAAC TCAGCTCACA TCACCAGCTC ACCTCTGGTA 2580 

GCCATAGCAG CCCCTGCTTC AGCCCCACCG CACCCCTCCA GGGGGCCTGC CTTTCCCTGA 2640 

CACTTTTGGG GTCTGCCTGG GGGAGGAGGG GAGAAAGCAC CATGAGTGCT CACTAAAACA 2700 

ACTTTTTCCA TTTTTAATAA AACGCCAAAA ATATCACAAC CCACCAAAAA TAGATGCCTC 2760 

TCCCCCTCCA GCCCTAGCCG AGCTGGTCCT AGGCCCCGCC TAGTGCCCCA CCCCCACCCA 2820 

CAGTGCTGCA CTCCTCCTGC CCCTGCCACG CCCACCCCCT GCCCACCTCT CCAGGCTCTG 2880 

CTCTGCAGCA CACCCGTGGG TGACCCCTCA CCCCAGAAGC AGCAGTGGCA GCTTGGGAAA 2940 

TGTGAGGAAG GGAAGGAGGG AGAGACGGGA GGGAGGAGAG AGAGGAGAAG GGAGGCAGGG 3000 

GAGGGGCAGC AGAACCAAGG CAAATATTTC AGCTGGGCTA TACCCCTCTC CCCATCCCTG 3060 

TTATAGAAGC TTAGAGAGCC AGCCAGCAAT GGAACCTTCT GGTTCCTGCG CCAATCGCCA 3120 

CCAGTATCAA TTGTGTGAGC TTGGGTGCGA GTGCACGCGT GCGTGAGTAC GGAGAGTATA 3180 

TATAGATCTC TATCTCTTAG CAAAGGTGAA TGCCAGATGT AAATGGCGCC TCTGGGCAAA 3240 

GGAGGCTTGT ATTTTGCACA TTTTATAAAA ACTTGAGAGA ATGAGATTTC TGCTTGTATA 3300 

TTTCTAAAAA GAGGAAGGAG CCCAAACCAT CCTCTCCTTA CCACTCCCAT CCCTGTGAGC 3360 

CCTACCTTAC CCCTCTGCCC CTAGCCAAGG AGTGTGAATT TATAGATCTA ACTTTCATAG 3420 

GCAAAACAAA AGCTTCGAGC TGTTGCGTGT GTGAGTCTGT TGTGTGGATG TGCGTGTGTG 3480 

GTCCCCAGCC CCAGACTGGA TTGGAAAAGT GCATGGTGGG GGCCTCGGGG CTGTCCCCAC 3540 

GCTGTCCCTT TGCCACAAGT CTGTGGGGCA AGAGGCTGCA ATATTCCGTC CTGGGTGTCT 3600 

GGGCTGCTAA CCTGGCCTGC TCAGGCTTCC CACCCTGTGC GGGGCACACC CCCAGGAAGG 3660 

GACCCTGGAC ACGGCTCCCA CGTCCAGGCT TAAGGTGGAT GCACTTCCCG CACCTCCAGT 3720 
CTTCTGTGTA GCAGCTTTAA CCCACGTTTG TCTGTCACGT CCAGTCCCGA GACGGCTGAG 3780 

TGACCCCAAG AAAGGCTTCC CCGACACCCA GACAGAGGCT GCAGGGCTGG GGCTGGCTGA 3840 

GGGTGGCGGG CCTGCGGGGA CATTCTACTG TGCTAAAAAG CCACTGCAGA CATAGCAATA 3900 
AAAACATGTC ATTTTCC 

Seq ID NO: 67 Protein sequence: 
Protein Accession #: NP_005620.1 

1 11 21 31 41 51 

I I I I I I 

MAKKSAENGI YSVSGDEKKG PLIAPGPDGA PAKGDGPVGL GTPGGRLAVP PRETKTRQMD 60 

FIMSCVGFAV GLGNVWRFPY LCYKNGGGVF LIPYVLIALV GGIPIFFLEI SLGQFMKAGS 120 

INVWNICPLF KGLGYASMVI VFYCNTYYIM VLAWGFYYLV KSFTTTLPWA TCGHTWNTPD 180 

CVEIFRHEDC ANASLANLTC DQLADRRSPV IEFWENKVLR LSGGLEVPGA LNWEVTLCLL 240 

ACWVLVYFCV WKGVKSTGKI VYFTATFPYV VLVVLLVRGV LLPGALDGII YYLKPDWSKL 300 

GSPQVWIDAG TQIFFSYAIG LGALTALGSY NRFNNNCYKD AIILALINSG TSFFAGFVVF 360 

SILGFMAAEQ GVHISKVAES GPGLAFIAYP RAVTLMPVAP LWAALFFFML LLLGLDSQFV 420 

GVEGFITGLL DLLPASYYFR FQREISVALC CALCFVIDLS MVTDGGMYVF QLFDYYSASG 480 

TTLLWQAFWE CVVVAWVYGA DRFMDDIACM IGYRPCPWMK WCWSFFTPLV CMGIFIFNVV 540 

YYEPLVYNNT YVYPWWGEAM GWAFALSSML CVPLHLLGCL LRAKGTMAER WQHLTQPIWG 600 
LHHLEYRAQD ADVRGLTTLT PVSESSKVVV VESVM 



Seq ID NO: 68 DNA sequence 

Nucleic Acid Accession #: NM_021953.1 

Coding sequence: 178-2469 

1 11 21 31 41 51 

I I I I I I 

GGCACGAGGG GGACCCGGCC GGTCCGGCGC GAGCCCCCGT CCGGGGCCCT GGCTCGGCCC 60 

CCAGGTTGGA GGAGCCCGGA GCCCGCCTTC GGAGCTACGG CCTAACGGCG GCGGCGACTG 120 

CAGTCTGGAG GGTCCACACT TGTGATTCTC AATGGAGAGT GAAAACGCAG ATTCATAATG 180 

AAAGCTAGCC CCCGTCGGCC ACTGATTCTC AAAAGACGGA GGCTGCCCCT TCCTGTTCAA 240 

AATGCCCCAA GTGAAACATC AGAGGAGGAA CCTAAGAGAT CCCCTGCCCA ACAGGAGTCT 300 

AATCAAGCAG AGGCCTCCAA GGAAGTGGCG GAGTCCAACT CTTGCAAGTT TCCAGCTGGG 360 

ATCAAGATTA TTAACCACCC CACCATGCCC AACACGCAAG TAGTGGCCAT CCCCAACAAT 420 

GCTAATATTC ACAGCATCAT CACAGCACTG ACTGCCAAGG GAAAAGAGAG TGGCAGTAGT 480 

GGGCCCAACA AATTCATCCT CATCAGCTGT GGGGGAGCCC CAACTCAGCC TCCAGGACTC 540 

CGGCCTCAAA CCCAAACCAG CTATGATGCC AAAAGGACAG AAGTGACCCT GGAGACCTTG 600 

GGACCAAAAC CTGCAGCTAG GGATGTGAAT CTTCCTAGAC CACCTGGAGC CCTTTGCGAG 660 

CAGAAACGGG AGACCTGTGC AGATGGTGAG GCAGCAGGCT GCACTATCAA CAATAGCCTA 720 

TCCAACATCC AGTGGCTTCG AAAGATGAGT TCTGATGGAC TGGGCTCCCG CAGCATCAAG 780 

CAAGAGATGG AGGAAAAGGA GAATTGTCAC CTGGAGCAGC GACAGGTTAA GGTTGAGGAG 840 

CCTTCGAGAC CATCAGCGTC CTGGCAGAAC TCTGTGTCTG AGCGGCCACC CTACTCTTAC 900 

ATGGCCATGA TACAATTCGC CATCAACAGC ACTGAGAGGA AGCGCATGAC TTTGAAAGAC 960 

ATCTATACGT GGATTGAGGA CCACTTTCCC TACTTTAAGC ACATTGCCAA GCCAGGCTGG 1020 

AAGAACTCCA TCCGCCACAA CCTTTCCCTG CACGACATGT TTGTCCGGGA GACGTCTGCC 1080 

AATGGCAAGG TCTCCTTCTG GACCATTCAC CCCAGTGCCA ACCGCTACTT GACATTGGAC 1140 

CAGGTGTTTA AGCCACTGGA CCCAGGGTCT CCACAATTGC CCGAGCACTT GGAATCACAG 1200 

CAGAAACGAC CGAATCCAGA GCTCCGCCGG AACATGACCA TCAAAACCGA ACTCCCCCTG 1260 

GGCGCACGGC GGAAGATGAA GCCACTGCTA CCACGGGTCA GCTCATACCT GGTACCTATC 1320 

CAGTTCCCGG TGAACCAGTC ACTGGTGTTG CAGCCCTCGG TGAAGGTGCC ATTGCCCCTG 1380 

GCGGCTTCCC TCATGAGCTC AGAGCTTGCC CGCCATAGCA AGCGAGTCCG CATTGCCCCC 1440 

AAGGTGCTGC TAGCTGAGGA GGGGATAGCT CCTCTTTCTT CTGCAGGACC AGGGAAAGAG 1500 

GAGAAACTCC TGTTTGGAGA AGGGTTTTCT CCTTTGCTTC CAGTTCAGAC TATCAAGGAG 1560 

GAAGAAATCC AGCCTGGGGA GGAAATGCCA CACTTAGCGA GACCCATCAA AGTGGAGAGC 1620 

CCTCCCTTGG AAGAGTGGCC CTCCCCGGCC CCATCTTTCA AAGAGGAATC ATCTCACTCC 1680 

TGGGAGGATT CGTCCCAATC TCCCACCCCA AGACCCAAGA AGTCCTACAG TGGGCTTAGG 1740 

TCCCCAACCC GGTGTGTCTC GGAAATGCTT GTGATTCAAC ACAGGGAGAG GAGGGAGAGG 1800 

AGCCGGTCTC GGAGGAAACA GCATCTACTG CCTCCCTGTG TGGATGAGCC GGAGCTGCTC 1860 

TTCTCAGAGG GGCCCAGTAC TTCCCGCTGG GCCGCAGAGC TCCCGTTCCC AGCAGACTCC 1920 

TCTGACCCTG CCTCCCAGCT CAGCTACTCC CAGGAAGTGG GAGGACCTTT TAAGACACCC 1980 

ATTAAGGAAA CGCTGCCCAT CTCCTCCACC CCGAGCAAAT CTGTCCTCCC CAGAACCCCT 2040 

GAATCCTGGA GGCTCACGCC CCCAGCCAAA GTAGGGGGAC TGGATTTCAG CCCAGTACAA 2100 

ACCTCCCAGG GTGCCTCTGA CCCCTTGCCT GACCCCCTGG GGCTGATGGA TCTCAGCACC 2160 

ACTCCCTTGC AAAGTGCTCC CCCCCTTGAA TCACCGCAAA GGCTCCTCAG TTCAGAACCC 2220 

TTAGACCTCA TCTCCGTCCC CTTTGGCAAC TCTTCTCCCT CAGATATAGA CGTCCCCAAG 2280 

CCAGGCTCCC CGGAGCCACA GGTTTCTGGC CTTGCAGCCA ATCGTTCTCT GACAGAAGGC 2340 

CTGGTCCTGG ACACAATGAA TGACAGCCTC AGCAAGATCC TGCTGGACAT CAGCTTTCCT 2400 

GGCCTGGACG AGGACCCACT GGGCCCTGAC AACATCAACT GGTCCCAGTT TATTCCTGAG 2460 

CTACAGTAGA GCCCTGCCCT TGCCCCTGTG CTCAAGCTGT CCACCATCCC GGGCACTCCA 2520 

AGGCTCAGTG CACCCCAAGC CTCTGAGTGA GGACAGCAGG CAGGGACTGT TCTGCTCCTC 2580 

ATAGCTCCCT GCTGCCTGAT TATGCAAAAG TAGCAGTCAC ACCCTAGCCA CTGCTGGGAC 2640 

CTTGTGTTCC CCAAGAGTAT CTGATTCCTC TGCTGTCCCT GCCAGGAGCT GAAGGGTGGG 2700 

AACAACAAAG GGAATGGTGA AAAGAGATTA GGAACCCCCC AGCCTGTTTC CATTCTCTGC 2760 

CCAGCAGTCT CTTACCTTCC CTGATCTTTG CAGGGTGGTC CGTGTAAATA GTATAAATTC 2820 

TCCAAATTAT CCTCTAATTA TAAATGTAAG CTTATTTCCT TAGATCATTA TCCAGAGACT 2880 

GCCAGAAGGT GGGTAGGATG ACCTGGGGTT TCAATTGACT TCTGTTCCTT GCTTTTAGTT 2940 

TTGATAGAAG GGAAGACCTG CAGTGCACGG TTTCTTCCAG GCTGAGGTAC CTGGATCTTG 3000 

GGTTCTTCAC TGCAGGGACC CAGACAAGTG GATCTGCTTG CCAGAGTCCT TTTTGCCCCT 3060 

CCCTGCCACC TCCCCGTGTT TCCAAGTCAG CTTTCCTGCA AGAAGAAATC CTGGTTAAAA 3120 

AAGTCTTTTG TATTGGGTCA GGAGTTGAAT TTGGGGTGGG AGGATGGATG CAACTGAAGC 3180 

AGAGTGTGGG TGCCCAGATG TGCGCTATTA GATGTTTCTC TGATAATGTC CCCAATCATA 3240 

CCAGGGAGAC TGGCATTGAC GAGAACTCAG GTGGAGGCTT GAGAAGGCCG AAAGGGCCCC 3300 

TGACCTGCCT GGCTTCCTTA GCTTGCCCCT CAGCTTTGCA AAGAGCCACC CTAGGCCCCA 3360 

GCTGACCGCA TGGGTGTGAG CCAGCTTGAG AACACTAACT ACTCAATAAA AGCGAAGGTG 3420 
GACCNAAAAA AAAAAAAAAA AAAA 

Seq ID NO: 69 Protein sequence: 
Protein Accession #: NP_068772.1 

1 11 21 31 41 51 

I I I I I I 

MKASPRRPLI LKRRRLPLPV QNAPSETSEE EPKRSPAQQE SNQAEASKEV AESNSCKFPA 60 

GIKIINHPTM PNTQVVAIPN NANIHSIITA LTAKGKESGS SGPNKFILIS CGGAPTQPPG 120 

LRPQTQTSYD AKRTEVTLET LGPKPAARDV NLPRPPGALC EQKRETCADG EAAGCTINNS 180 

LSNIQWLRKM SSDGLGSRSI KQEMEEKENC HLEQRQVKVE EPSRPSASWQ NSVSERPPYS 240 

YMAMIQFAIN STERKRMTLK DIYTWIEDHF PYFKHIAKPG WKNSIRHNLS LHDMFVRETS 300 

ANGKVSFWTI HPSANRYLTL DQVFKPLDPG SPQLPEHLES QQKRPNPELR RNMTIKTELP 360 

LGARRKMKPL LPRVSSYLVP IQFPVNQSLV LQPSVKVPLP LAASLMSSEL ARHSKRVRIA 420 

PKVLLAEEGI APLSSAGPGK EEKLLFGEGF SPLLPVQTIK EEEIQPGEEM PHLARPIKVE 480 

SPPLEEWPSP APSFKEESSH SWEDSSQSPT PRPKKSYSGL RSPTRCVSEM LVIQHRERRE 540 

RSRSRRKQHL LPPCVDEPEL LFSEGPSTSR WAAELPFPAD SSDPASQLSY SQEVGGPFKT 600 

PIKETLPISS TPSKSVLPRT PESWRLTPPA KVGGLDFSPV QTSQGASDPL PDPLGLMDLS 660 

TTPLQSAPPL ESPQRLLSSE PLDLISVPFG NSSPSDIDVP KPGSPEPQVS GLAANRSLTE 720 
GLVLDTMNDS LSKILLDISF PGLDEDPLGP DNINWSQFIP ELQ 



Seq ID NO: 70 DNA sequence 

Nucleic Acid Accession #: BC006529.1 

Coding sequence: 178-2424 

1 11 21 31 41 51 

I I I I I I 

GGCACGAGGG GGACCCGGCC GGTCCGGCGC GAGCCCCCGT CCGGGGCCCT GGCTCGGCCC 60 

CCAGGTTGGA GGAGCCCGGA GCCCGCCTTC GGAGCTACGG CCTAACGGCG GCGGCGACTG 120 

CAGTCTGGAG GGTCCACACT TGTGATTCTC AATGGAGAGT GAAAACGCAG ATTCATAATG 180 

AAAACTAGCC CCCGTCGGCC ACTGATTCTC AAAAGACGGA GGCTGCCCCT TCCTGTTCAA 240 

AATGCCCCAA GTGAAACATC AGAGGAGGAA CCTAAGAGAT CCCCTGCCCA ACAGGAGTCT 300 

AATCAAGCAG AGGCCTCCAA GGAAGTGGCA GAGTCCAACT CTTGCAAGTT TCCAGCTGGG 360 

ATCAAGATTA TTAACCACCC CACCATGCCC AACACGCAAG TAGTGGCCAT CCCCAACAAT 420 

GCTAATATTC ACAGCATCAT CACAGCACTG ACTGCCAAGG GAAAAGAGAG TGGCAGTAGT 480 

GGGCCCAACA AATTCATCCT CATCAGCTGT GGGGGAGCCC CAACTCAGCC TCCAGGACTC 540 

CGGCCTCAAA CCCAAACCAG CTATGATGCC AAAAGGACAG AAGTGACCCT GGAGACCTTG 600 

GGACCAAAAC CTGCAGCTAG GGATGTGAAT CTTCCTAGAC CACCTGGAGC CCTTTGCGAG 660 

CAGAAACGGG AGACCTGTGC AGATGGTGAG GCAGCAGGCT GCACTATCAA CAATAGCCTA 720 

TCCAACATCC AGTGGCTTCG AAAGATGAGT TCTGATGGAC TGGGCTCCCG CAGCATCAAG 780 

CAAGAGATGG AGGAAAAGGA GAATTGTCAC CTGGAGCAGC GACAGGTTAA GGTTGAGGAG 840 

CCTTCGAGAC CATCAGCGTC CTGGCAGAAC TCTGTGTCTG AGCGGCCACC CTACTCTTAC 900 

ATGGCCATGA TACAATTCGC CATCAACAGC ACTGAGAGGA AGCGCATGAC TTTGAAAGAC 960 

ATCTATACGT GGATTGAGGA CCACTTTCCC TACTTTAAGC ACATTGCCAA GCCAGGCTGG 1020 

AAGAACTCCA TCCGCCACAA CCTTTCCCTG CACGACATGT TTGTCCGGGA GACGTCTGCC 1080 

AATGGCAAGG TCTCCTTCTG GACCATTCAC CCCAGTGCCA ACCGCTACTT GACATTGGAC 1140 

CAGGTGTTTA AGCAGCAGAA ACGACCGAAT CCAGAGCTCC GCCGGAACAT GACCATCAAA 1200 

ACCGAACTCC CCCTGGGCGC ACGGCGGAAG ATGAAGCCAC TGCTACCACG GGTCAGCTCA 1260 

TACCTGGTAC CTATCCAGTT CCCGGTGAAC CAGTCACTGG TGTTGCAGCC CTCGGTGAAG 1320 

GTGCCATTGC CCCTGGCGGC TTCCCTCATG AGCTCAGAGC TTGCCCGCCA TAGCAAGCGA 1380 

GTCCGCATTG CCCCCAAGGT GCTGCTAGCT GAGGAGGGGA TAGCTCCTCT TTCTTCTGCA 1440 

GGACCAGGGA AAGAGGAGAA ACTCCTGTTT GGAGAAGGGT TTTCTCCTTT GCTTCCAGTT 1500 

CAGACTATCA AGGAGGAAGA AATCCAGCCT GGGGAGGAAA TGCCACACTT AGCGAGACCC 1560 

ATCAAAGTGG AGAGCCCTCC CTTGGAAGAG TGGCCCTCCC CGGCCCCATC TTTCAAAGAG 1620 

GAATCATCTC ACTCCTGGGA GGATTCGTCC CAATCTCCCA CCCCAAGACC CAAGAAGTCC 1680 

TACAGTGGGC TTAGGTCCCC AACCCGGTGT GTCTCGGAAA TGCTTGTGAT TCAACACAGG 1740 

GAGAGGAGGG AGAGGAGCCG GTCTCGGAGG AAACAGCATC TACTGCCTCC CTGTGTGGAT 1800 

GAGCGGGAGC TGCTCTTCTC AGAGGGGCCC AGTACTTCCC GCTGGGCCGC AGAGCTCCCG 1860 

TTCCCAGCAG ACTCCTCTGA CCCTGCCTCC CAGCTCAGCT ACTCCCAGGA AGTGGGAGGA 1920 

CCTTTTAAGA CACCCATTAA GGAAACGCTG CCCATCTCCT CCACCCCGAG CAAATCTGTC 1980 

CTCCCCAGAA CCCCTGAATC CTGGAGGCTC ACGCCCCCAG CCAAAGTAGG GGGACTGGAT 2040 

TTCAGCCCAG TACAAACCCC CCAGGGTGCC TCTGACCCCT TGCCTGACCC CCTGGGGCTG 2100 

ATGGATCTCA GCACCACTCC CTTGCAAAGT GCTCCCCCCC TTGAATCACC GCAAAGGCTC 2160 

CTCAGTTCAG AACCCTTAGA CCTCATCTCC GTCCCCTTTG GCAACTCTTC TCCCTCAGAT 2220 

ATAGACGTCC CCAAGCCAGG CTCCCCGGAG CCACAGGTTT CTGGCCTTGC AGCCAATCGT 2280 

TCTCTGACAG AAGGCCTGGT CCTGGACACA ATGAATGACA GCCTCAGCAA GATCCTGCTG 2340 

GACATCAGCT TTCCTGGCCT GGACGAGGAC CCACTGGGCC CTGACAACAT CAACTGGTCC 2400 

CAGTTTATTC CTGAGCTACA GTAGAGCCCT GCCCTTGCCC CTGTGCTCAA GCTGTCCACC 2460 

ATCCCGGGCA CTCCAAGGCT CAGTGCACCC CAAGCCTCTG AGTGAGGACA GCAGGCAGGG 2520 

ACTGTTCTGC TCCTCATAGC TCCCTGCTGC CTGATTATGC AAAAGTAGCA GTCACACCCT 2580 

AGCCACTGCT GGGACCTTGT GTTCCCCAAG AGTATCTGAT TCCTCTGCTG TCCCTGCCAG 2640 

GAGCTGAAGG GTGGGAACAA CAAAGGCAAT GGTGAAAAGA GATTAGGAAC CCCCCAGCCT 2700 

GTTTCCATTC TCTGCCCAGC AGTCTCTTAC CTTCCCTGAT CTTTGCAGGG TGGTCCGTGT 2760 

AAATAGTATA AATTCTCCAA ATTATCCTCT AATTATAAAT GTAAGCTTAT TTCCTTAGAT 2820 

CATTATCCAG AGACTGCCAG AAGGTGGGTA GGATGACCTG GGGTTTCAAT TGACTTCTGT 2880 

TCCTTGCTTT TAGTTTTGAT AGAAGGGAAG ACCTGCAGTG CACGGTTTCT TCCAGGCTGA 2940 

GGTACCTGGA TCTTGGGTTC TTCACTGCAG GGACCCAGAC AAGTGGATCT GCTTGCCAGA 3000 

GTCCTTTTTG CCCCTCCCTG CCACCTCCCC GTGTTTCCAA GTCAGCTTTC CTGCAAGAAG 3060 

AAATCCTGGT TAAAAAAGTC TTTTGTATTG GGTCAGGAGT TGAATTTGGG GTGGGAGGAT 3120 

GGATGCAACT GAAGCAGAGT GTGGGTGCCC AGATGTGCGC TATTAGATGT TTCTCTGATA 3180 

ATGTCCCCAA TCATACCAGG GAGACTGGCA TTGACGAGAA CTCAGGTGGA GGCTTGAGAA 3240 

GGCCGAAAGG GCCCCTGACC TGCCTGGCTT CCTTAGCTTG CCCCTCAGCT TTGCAAAGAG 3300 

CCACCCTAGG CCCCAGCTGA CCGCATGGGT GTGAGCCAGC TTGAGAACAC TAACTACTCA 3360 
ATAAAAGCGA AGGTGGAAAA AAAAAAAAAA AAAAAAA 

Seq ID NO: 71 Protein sequence: 
Protein Accession #: AAH06529.1 



1 11 21 31 41 51 

I I I I I I 

MKTSPRRPLI LKRRRLPLPV QNAPSETSEE EPKRSPAQQE SNQAEASKEV AESNSCKFPA 60 

GIKIINHPTM PNTQVVAIPN NANIHSIITA LTAKGKESGS SGPNKFILIS CGGAPTQPPG 120 

LRPQTQTSYD AKRTEVTLET LGPKPAARDV NLPRPPGALC EQKRETCADG EAAGCTINNS 180 

LSNIQWLRKM SSDGLGSRSI KQEMEEKENC HLEQRQVKVE EPSRPSASWQ NSVSERPPYS 240 

YMAMIQFAIN STERKRMTLK DIYTWIEDHF PYFKHIAKPG WKNSIRHNLS LHDMFVRETS 300 

ANGKVSFWTI HPSANRYLTL DQVFKQQKRP NPELRRNMTI KTELPLGARR KMKPLLPRVS 360 

SYLVPIQFPV NQSLVLQPSV KVPLPLAASL MSSELARHSK RVRIAPKVLL AEEGIAPLSS 420 

AGPGKEEKLL FGEGFSPLLP VQTIKEEEIQ PGEEMPHLAR PIKVESPPLE EWPSPAPSFK 480 

EESSHSWEDS SQSPTPRPKK SYSGLRSPTR CVSEMLVIQH RERRERSRSR RKQHLLPPCV 540 

DEPELLFSEG PSTSRWAAEL PFPADSSDPA SQLSYSQEVG GPFKTPIKET LPISSTPSKS 600 

VLPRTPESWR LTPPAKVGGL DFSPVQTPQG ASDPLPDPLG LMDLSTTPLQ SAPPLESPQR 660 

LLSSEPLDLI SVPFGNSSPS DIDVPKPGSP EPQVSGLAAN RSLTEGLVLD TMNDSLSKIL 720 
LDISFPGLDE DPLGPDNINW SQFIPELQ 



Seq ID NO: 72 DNA sequence 
Nucleic Acid Accession #: U74612.1 
Coding sequence: 178-2583 

1 11 21 31 41 51 

I I I I I I 

GGCACGAGGG GGACCCGGCC GGTCCGGCGC GAGCCCCCGT CCGGGGCCCT GGCTCGGCCC 60 

CCAGGTTGGA GGAGCCCGGA GCCCGCCTTC GGAGCTACGG CCTAACGGCG GCGGCGACTG 120 

CAGTCTGGAG GGTCCACACT TGTGATTCTC AATGGAGAGT GAAAACGCAG ATTCATAATG 180 

AAAACTAGCC CCCGTCGGCC ACTGATTCTC AAAAGACGGA GGCTGCCCCT TCCTGTTCAA 240 

AATGCCCCAA GTGAAACATC AGAGGAGGAA CCTAAGAGAT CCCCTGCCCA ACAGGAGTCT 300 

AATCAAGCAG AGGCCTCCAA GGAAGTGGCA GAGTCCAACT CTTGCAAGTT TCCAGCTGGG 360 

ATCAAGATTA TTAACCACCC CACCATGCCC AACACGCAAG TAGTGGCCAT CCCCAACAAT 420 

GCTAATATTC ACAGCATCAT CACAGCACTG ACTGCCAAGG GAAAAGAGAG TGGCAGTAGT 480 

GGGCCCAACA AATTCATCCT CATCAGCTGT GGGGGAGCCC CAACTCAGCC TCCAGGACTC 540 

CGGCCTCAAA CCCAAACCAG CTATGATGCC AAAAGGACAG AAGTGACCCT GGAGACCTTG 600 

GGACCAAAAC CTGCAGCTAG GGATGTGAAT CTTCCTAGAC CACCTGGAGC CCTTTGCGAG 660 

CAGAAACGGG AGACCTGTGC AGATGGTGAG GCAGCAGGCT GCACTATCAA CAATAGCCTA 720 

TCCAACATCC AGTGGCTTCG AAAGATGAGT TCTGATGGAC TGGGCTCCCG CAGCATCAAG 780 

CAAGAGATGG AGGAAAAGGA GAATTGTCAC CTGGAGCAGC GACAGGTTAA GGTTGAGGAG 840 

CCTTCGAGAC CATCAGCGTC CTGGCAGAAC TCTGTGTCTG AGCGGCCACC CTACTCTTAC 900 

ATGGCCATGA TACAATTCGC CATCAACAGC ACTGAGAGGA AGCGCATGAC TTTGAAAGAC 960 

ATCTATACGT GGATTGAGGA CCACTTTCCC TACTTTAAGC ACATTGCCAA GCCAGGCTGG 1020 

AAGAACTCCA TCCGCCACAA CCTTTCCCTG CACGACATGT TTGTCCGGGA GACGTCTGCC 1080 

AATGGCAAGG TCTCCTTCTG GACCATTCAC CCCAGTGCCA ACCGCTACTT GACATTGGAC 1140 

CAGGTGTTTA AGCCACTGGA CCCAGGGTCT CCACAATTGC CCGAGCACTT GGAATCACAG 1200 

CAGAAACGAC CGAATCCAGA GCTCCGCCGG AACATGACCA TCAAAACCGA ACTCCCCCTG 1260 

GGCGCACGGC GGAAGATGAA GCCACTGCTA CCACGGGTCA GCTCATACCT GGTACCTATC 1320 

CAGTTCCCGG TGAACCAGTC ACTGGTGTTG CAGCCCTCGG TGAAGGTGCC ATTGCCCCTG 1380 

GCGGCTTCCC TCATGAGCTC AGAGCTTGCC CGCCATAGCA AGCGAGTCCG CATTGCCCCC 1440 

AAGGTTTTTG GGGAACAGGT GGTGTTTGGT TACATGAGTA AGTTCTTTAG TGGCGATCTG 1500 

CGAGATTTTG GTACACCCAT CACCAGCTTG TTTAATTTTA TCTTTCTTTG TTTATCAGTG 1560 

CTGCTAGCTG AGGAGGGGAT AGCTCCTCTT TCTTCTGCAG GACCAGGGAA AGAGGAGAAA 1620 

CTCCTGTTTG GAGAAGGGTT TTCTCCTTTG CTTCCAGTTC AGACTATCAA GGAGGAAGAA 1680 

ATCCAGCCTG GGGAGGAAAT GCCACACTTA GCGAGACCCA TCAAAGTGGA GAGCCCTCCC 1740 

TTGGAAGAGT GGCCCTCCCC GGCCCCATCT TTCAAAGAGG AATCATCTCA CTCCTGGGAG 1800 

GATTCGTCCC AATCTCCCAC CCCAAGACCC AAGAAGTCCT ACAGTGGGCT TAGGTCCCCA 1860 

ACCCGGTGTG TCTCGGAAAT GCTTGTGATT CAACACAGGG AGAGGAGGGA GAGGAGCCGG 1920 

TCTCGGAGGA AACAGCATCT ACTGCCTCCC TGTGTGGATG AGCCGGAGCT GCTCTTCTCA 1980 

GAGGGGCCCA GTACTTCCCG CTGGGCCGCA GAGCTCCCGT TCCCAGCAGA CTCCTCTGAC 2040 

CCTGCCTCCC AGCTCAGCTA CTCCCAGGAA GTGGGAGGAC CTTTTAAGAC ACCCATTAAG 2100 

GAAACGCTGC CCATCTCCTC CACCCCGAGC AAATCTGTCC TCCCCAGAAC CCCTGAATCC 2160 

TGGAGGCTCA CGCCCCCAGC CAAAGTAGGG GGACTGGATT TCAGCCCAGT ACAAACCTCC 2220 

CAGGGTGCCT CTGACCCCTT GCCTGACCCC CTGGGGCTGA TGGATCTCAG CACCACTCCC 2280 

TTGCAAAGTG CTCCCCCCCT TGAATCACCG CAAAGGCTCC TCAGTTCAGA ACCCTTAGAC 2340 

CTCATCTCCG TCCCCTTTGG CAACTCTTCT CCCTCAGATA TAGACGTCCC CAAGCCAGGC 2400 

TCCCCGGAGC CACAGGTTTC TGGCCTTGCA GCCAATCGTT CTCTGACAGA AGGCCTGGTC 2460 

CTGGACACAA TGAATGACAG CCTCAGCAAG ATCGTGCTGG ACATCAGCTT TCCTGGCCTG 2520 

GACGAGGACC CACTGGGCCC TGACAACATC AACTGGTCCC AGTTTATTCC TGAGCTACAG 2580 

TAGAGCCCTG CCCTTGCCCC TGTGCTCAAG CTGTCCACCA TCCCGGGCAC TCCAAGGCTC 2640 

AGTGCACCCC AAGCCTCTGA GTGAGGACAG CAGGCAGGGA CTGTTCTGCT CCTCATAGCT 2700 

CCCTGCTGCC TGATTATGCA AAAGTAGCAG TCACACCCTA GCCACTGCTG GGACCTTGTG 2760 

TTCCCCAAGA GTATCTGATT CCTCTGCTGT CCCTGCCAGG AGCTGAAGGG TGGGAACAAC 2820 

AAAGGCAATG GTGAAAAGAG ATTAGGAACC CCCCAGCCTG TTTCCATTCT CTGCCCAGCA 2880 

GTCTCTTACC TTCCCTGATC TTTGCAGGGT GGTCCGTGTA AATAGTATAA ATTCTCCAAA 2940 

TTATCCTCTA ATTATAAATG TAAGCTTATT TCCTTAGATC ATTATCCAGA GACTGCCAGA 3000 

AGGTGGGTAG GATGACCTGG GGTTTCAATT GACTTCTGTT CCTTGCTTTT AGTTTTGATA 3060 

GAAGGGAAGA CCTGCAGTGC ACGGTTTCTT CCAGGCTGAG GTACCTGGAT CTTGGGTTCT 3120 

TCACTGCAGG GACCCAGACA AGTGGATCTG CTTGCCAGAG TCCTTTTTGC CCCTCCCTGC 3180 

CACCTCCCCG TGTTTCCAAG TCAGCTTTCC TGCAAGAAGA AATCCTGGTT AAAAAAGTCT 3240 

TTTGTATTGG GTCAGGAGTT GAATTTGGGG TGGGAGGATG GATGCAACTG AAGCAGAGTG 3300 

TGGGTGCCCA GATGTGCGCT ATTAGATGTT TCTCTGATAA TGTCCCCAAT CATACCAGGG 3360 

AGACTGGCAT TGACGAGAAC TCAGGTGGAG GCTTGAGAAG GCCGAAAGGG CCCCTGACCT 3420 

GCCTGGCTTC CTTAGCTTGC CCCTCAGCTT TGCAAAGAGC CACCCTAGGC CCCAGCTGAC 3480 

CGCATGGGTG TGAGCCAGCT TGAGAACACT AACTACTCAA TAAAAGCGAA GGTGGACAAA 3540 
AAAAAAAAAA AAAAA 

Seq ID NO: 73 Protein sequence: 
Protein Accession #: AAC51128.1 

1 11 21 31 41 51 

I I I I I I 

MKTSPRRPLI LKRRRLPLPV QNAPSETSEE EPKRSPAQQE SNQAEASKEV AESNSCKFPA 60 

GIKIINHPTM PNTQVVAIPN NANIHSIITA LTAKGKESGS SGPNKFILIS CGGAPTQPPG 120 

LRPQTQTSYD AKRTEVTLET LGPKPAARDV NLPRPPGALC EQKRETCADG EAAGCTINNS 180 

LSNIQWLRKM SSDGLGSRSI KQEMEEKENC HLEQRQVKVE EPSRPSASWQ NSVSERPPYS 240 

YMAMIQFAIN STERKRMTLK DIYTWIEDHF PYFKHIAKPG WKNSIRHNLS LHDMFVRETS 300 

ANGKVSFWTI HPSANRYLTL DQVFKPLDPG SPQLPEHLES QQKRPNPELR RNMTIKTELP 360 

LGARRKMKPL LPRVSSYLVP IQFPVNQSLV LQPSVKVPLP LAASLMSSEL ARHSKRVRIA 420 

PKVFGEQVVF GYMSKFFSGD LRDFGTPITS LFNFIFLCLS VLLAEEGIAP LSSAGPGKEE 480 

KLLFGEGFSP LLPVQTIKEE EIQPGEEMPH LARPIKVESP PLEEWPSPAP SFKEESSHSW 540 

EDSSQSPTPR PKKSYSGLRS PTRCVSEMLV IQHRERRERS RSRRKQHLLP PCVDEPELLF 600 

SEGPSTSRWA AELPFPADSS DPASQLSYSQ EVGGPFKTPI KETLPISSTP SKSVLPRTPE 660 

SWRLTPPAKV GGLDFSPVQT SQGASDPLPD PLGLMDLSTT PLQSAPPLES PQRLLSSEPL 720 

DLISVPFGNS SPSDIDVPKP GSPEPQVSGL AANRSLTEGL VLDTMNDSLS KILLDISFPG 780 
LDEDPLGPDN INWSQFIPEL Q 

Seq ID NO: 74 DNA sequence 
Nucleic Acid Accession #: Eos sequence 
Coding sequence: 111-416 

1 11 21 31 41 51 

I I I I I I 

GGGAAGAGCC AGGCTGAGCC TTATAAAGGA CTGCTCTTTG TCCAAACACA CACATCTCAC 60 

TCATCCTTCT ACTCGTGACG CTTCCCAGCT CTGGCTTTTT GAAAGCAAAG ATGAGCAACA 120 

CTCAAGCTGA GAGGTCCATA ATAGGCATGA TCGACATGTT TCACAAATAC ACCAGACGTG 180 

ATGACAAGAT TGAGAAGCCA AGCCTGCTGA CGATGATGAA GGAGAACTTC CCCAACTTCC 240 

TTAGTGCCTG TGACAAAAAG GGCACAAATT ACCTCGCCGA TGTCTTTGAG AAAAAGGACA 300 

AGAATGAGGA TAAGAAGATT GATTTTTCTG AGTTTCTGTC CTTGCTGGGA GACATAGCCA 360 

CAGACTACCA CAAGCAGAGC CATGGAGCAG CGCCCTGTTC CGGGGGCAGC CAGTGACCCA 420 
GCCCCACCAA TGGGCCTCCA GAGACCCCAG GAACAATAAA ATGTCTTCTC CCACCAGA 

Seq ID NO: 75 Protein sequence: 
Protein Accession #: Eos sequence 



1 11 21 31 41 51 

I I I I I I 

MSNTQAERSI IGMIDMFHKY TRRDDKIEKP SLLTMMKENF PNFLSACDKK GTNYLADVFE 60 
KKDKNEDKKI DFSEFLSLLG DIATDYHKQS HGAAPCSGGS Q 
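
The coding-sequence coordinates given for each DNA entry can be checked against the paired protein entry. The following is a minimal sketch only, assuming that the coordinates are 1-based and inclusive of the stop codon and that the standard genetic code applies; the helper names `strip_layout` and `translate_cds` are hypothetical and are used here purely for illustration with Seq ID NO: 74 and Seq ID NO: 75.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def strip_layout(listing: str) -> str:
    """Drop position numbers, spaces and line breaks; keep only the residues."""
    return "".join(ch for ch in listing.upper() if ch.isalpha())


def translate_cds(dna: str, start: int, end: int) -> str:
    """Translate the 1-based, inclusive range start..end of dna (assumed convention)."""
    cds = dna[start - 1:end]
    if len(cds) % 3 != 0:
        raise ValueError("coding range is not a whole number of codons")
    # Codons containing anything other than T, C, A, G are reported as 'X'.
    protein = "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds), 3)
    )
    return protein.rstrip("*")  # the protein listings omit the stop codon


# Seq ID NO: 74 as listed (coding sequence: 111-416).
SEQ_74_DNA = strip_layout("""
GGGAAGAGCC AGGCTGAGCC TTATAAAGGA CTGCTCTTTG TCCAAACACA CACATCTCAC 60
TCATCCTTCT ACTCGTGACG CTTCCCAGCT CTGGCTTTTT GAAAGCAAAG ATGAGCAACA 120
CTCAAGCTGA GAGGTCCATA ATAGGCATGA TCGACATGTT TCACAAATAC ACCAGACGTG 180
ATGACAAGAT TGAGAAGCCA AGCCTGCTGA CGATGATGAA GGAGAACTTC CCCAACTTCC 240
TTAGTGCCTG TGACAAAAAG GGCACAAATT ACCTCGCCGA TGTCTTTGAG AAAAAGGACA 300
AGAATGAGGA TAAGAAGATT GATTTTTCTG AGTTTCTGTC CTTGCTGGGA GACATAGCCA 360
CAGACTACCA CAAGCAGAGC CATGGAGCAG CGCCCTGTTC CGGGGGCAGC CAGTGACCCA 420
GCCCCACCAA TGGGCCTCCA GAGACCCCAG GAACAATAAA ATGTCTTCTC CCACCAGA
""")

# Seq ID NO: 75 as listed.
SEQ_75_PROTEIN = strip_layout("""
MSNTQAERSI IGMIDMFHKY TRRDDKIEKP SLLTMMKENF PNFLSACDKK GTNYLADVFE 60
KKDKNEDKKI DFSEFLSLLG DIATDYHKQS HGAAPCSGGS Q
""")

if __name__ == "__main__":
    translated = translate_cds(SEQ_74_DNA, 111, 416)
    print("matches Seq ID NO: 75:", translated == SEQ_75_PROTEIN)
```

The same check may be applied to any DNA and protein pair in the listing; a mismatch would point either to a transcription error in one of the two entries or to a different coordinate convention than the one assumed here.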

Seq ID NO: 76 DNA sequence 
Nucleic Acid Accession #: Eos sequence 
Coding sequence: 111-416 

1 11 21 31 41 51 

I I I I I I 

GGGAAGAGCC AGGCTGAGCC TTATAAAGGA CTGCTCTTTG TCCAAACACA CACATCTCAC 60 

TCATCCTTCT ACTCGTGACA CTTCCCAGTT CTGGCTTTTT GAAAGCAAAG ATGAGCAACA 120 

CTCAAGCTGA GAGGTCCATA ATAGGCATGA TCGACATGTT TCACAAATAC ACCGGACGTG 180 

ATGGCAAGAT TGAGAAGCCA AGCCTGCTGA CGATGATGAA GGAGAACTTC CCCAATTTCC 240 

TCAGTGCCTG TGACAAAAAG GGCATACATT ACCTCGCCAC TGTCTTTGAG AAAAAGGACA 300 

AGAATGAGGA TAAGAAGATT GATTTTTCTG AGTTTCTGTC CTTGCTGGGA GACATAGCCG 360 

CAGACTACCA CAAGCAGAGC CATGGAGCGG CGCCCTGTTC TGGGGGAAGC CAGTGATCCA 420 
GCCCCACCAA GGGGCCTCCA GAGACCCCAG GAACAATAAG TGTCTCCTCC CACCAGA 



Seq ID NO: 77 Protein sequence: 
Protein Accession #: XP_048124.1 

1 11 21 31 41 51 

I I I I I I 

MSNTQAERSI IGMIDMFHKY TGRDGKIEKP SLLTMMKENF PNFLSACDKK GIHYLATVFE 60 
KKDKNEDKKI DFSEFLSLLG DIAADYHKQS HGAAPCSGGS Q 



Seq ID NO: 78 DNA sequence 
Nucleic Acid Accession #: Z73678. 
Coding sequence: 253-2433 



1 11 21 31 41 51 

I I I I I I 

GGGGTGGTGC AGGGCAGGGG TGGTATATCC TGTCTGACGG AGGGCGGGCC TCGCCAGTGC 60 

CAGAGAGGGA CGAACCAGGG TGGAAGCGCC AGGAGCAGCT GCAGGGAGCC CTCACGCGGA 120 

CCTCGCACTC TATGGCCGTA GGGAGCCGCT GAGAGCGAGA AGAGCACGCT CCTGCCCGCC 180 

CGCTGCACCG CACCTCGCCT CGCCTCTCTG CTCTCCTAGG CCCCGGCCGC GCGCCACCCG 240 

CCTCCCGCCA CCATGAACCA CTCGCCGCTC AAGACCGCCT TGGCGTACGA ATGCTTCCAG 300 

GACCAGGACA ACTCCACGTT GGCTTTGCCG TCGGACCAAA AGATGAAAAC AGGCACGTCT 360 

GGCAGGCAGC GCGTGCAGGA GCAGGTGATG ATGACCGTCA AGCGGCAGAA GTCCAAGTCT 420 

TCCCAGTCGT CCACCCTGAG CCACTCCAAT CGAGGTTCCA TGTATGATGG CTTGGCTGAC 480 

AATTACAACT ATGGGACCAC CAGCAGGAGC AGCTACTACT CCAAGTTCCA GGCAGGGAAT 540 

GGCTCATGGG GATATCCGAT CTACAATGGA ACCCTCAAGC GGGAGCCTGA CAACAGGCGC 600 

TTCAGCTCCT ACAGCCAGAT GGAGAACTGG AGCCGGCACT ACCCCCGGGG CAGCTGTAAC 660 

ACCACCGGCG CAGGCAGCGA CATCTGCTTC ATGCAGAAAA TCAAGGCGAG CCGCAGTGAG 720 

CCCGACCTCT ACTGTGACCC ACGGGGCACC CTGCGCAAGG GCACGCTGGG CAGCAAGGGC 780 

CAGAAGACCA CCCAGAACCG CTACAGCTTT TACAGCACCT GCAGTGGTCA GAAGGCCATA 840 

AAGAAGTGCC CTGTGCGCCC GCCCTCTTGT GCCTCCAAGC AGGACCCTGT GTATATCCCG 900 

CCCATCTCCT GCAACAAGGA CCTGTCCTTT GGCCACTCTA GGGCCAGCTC CAAGATCTGC 960 

AGTGAGGACA TCGAGTGCAG TGGGCTGACC ATCCCCAAGG CTGTGCAGTA CCTGAGCTCC 1020 

CAGGATGAGA AGTACCAGGC CATTGGGGCC TATTACATCC AGCATACCTG CTTCCAGGAT 1080 

GAATCTGCCA AGCAACAGGT CTATCAGCTG GGAGGCATCT GCAAGCTGGT GGACCTCCTC 1140 

CGCAGCCCCA ACCAGAACGT CCAGCAGGCC GCGGCAGGGG CCCTGCGCAA CCTGGTGTTC 1200 

AGGAGCACCA CCAACAAGCT GGAGACCCGG AGGCAGAATG GGATCCGOGA GGCAGTCAGC 1260 

CTCCTGAGGA GAACCGGGAA CGCOGAGATC CAGAAGCAGC TGACTGGGCT GCTCTGGAAC 1320 

CTGTCTTCCA CTGAGGAGCT GAAGGAGGAA CTCATTGCCG ACGCCCTGCC TGTTCTGGCC 1380 

GACCGCGTCA TCATTCCCTT CTCTGGCTGG TGCGATGGCA ATAGCAACAT GTCCCGGGAA 1440 

GTGGTGGACC CTGAGGTCTT CTTCAATGCC ACAGGCTGCT TGAGGAACCT GAGCTCGGCC 1500 

GATGCAGGCC GCCAGACCAT GGGTAACTAC TCAGGGCTCA TTGATTCCCT CATGGCCTAT 1560 

GTCCAGAACT GTGTAGCGGC CAGCCGCTGT GACGACAAGT CTGTGGAAAA CTGCATGTGT 1620 

GTTCTGCACA ACCTCTCCTA CCGCCTGGAC GCCGAGGTGC CCACCCGCTA CCGCCAGCTG 1680 

GAGTATAACG CCCGCAACGC CTACACCGAG AAGTCCTCCA CTGGCTGCTT CAGCAACAAG 1740 

AGCGACAAGA TGATGAACAA CAACTATGAC TGCCCCCTGC CTGAGGAAGA GACCAACCCC 1800 

AAGGGCAGCG GCTGGTTGTA CCATTCAGAT GCCATCCGCA CCTACCTGAA CCTCATGGGC 1860 

AAGAGCAAGA AAGATGCTAC CCTGGAGGCC TGTGCTGGTG CCCTGCAGAA CCTGACAGCC 1920 

AGCAAGGGGC TGATGTCCAG TGGCATGAGC CAGTTGATTG GGCTGAAGGA AAAGGGCCTG 1980 

CCACAAATTG CCCGCCTCCT GCAATCTGGC AACTCTGATG TGGTGCGGTC CGGAGCCTCC 2040 

CTCCTGAGCA ACATGTCCCG CCACCCTCTG CTGCACAGAG TGATGGGGAA CCAGGTGTTC 2100 

CCGGAGGTGA CCAGGCTCCT CACCAGCCAC ACTGGCAATA CCAGCAACTC CGAAGACATC 2160 

TTGTCCTCGG CCTGCTACAC TGTGAGGAAC CTGATGGCCT CGCAGCCACA ACTGGCCAAG 2220 

CAGTACTTCT CCAGCAGCAT GCTCAACAAC ATCATCAACC TGTGCCGAAG CAGTGCCTCA 2280 

CCCAAGGCCG CAGAAGCTGC CCGGCTTCTC CTGTCTGACA TGTGGTCCAG CAAGGAACTG 2340 

CAGGGTGTCC TCAGACAGCA AGGTTTCGAT AGGAACATGC TGGGAACCTT AGCTGGGGCC 2400 

AACAGCCTCA GGAACTTCAC CTCCCGATTC TAAGAAGAGA CTGTCCAAGC AAGTTAGGCT 2460 

TGCAGGAAGA TATGACCCAG CTGAGAAGCC CTCAGGCCTC GCTGGATGGG GTTTTCTGTC 2520 

CATCCTGTGC AGTATTTGGG AAAGTTCACA AGAAACTGAG AAGAAACCTA AAAACTGTGG 2580 

ATAGTGGAAA GATTTTTAGA TTTTTTTTTT CCTTGGGGAA ACTGGCAGGC AATGGGGGTT 2640 

AGGGAGGTTG GGGCGGGGGG GGCTTTCTTG AGTTAAAGGG GCTTATATGT GATGTCAATA 2700 

TTTCTTCCTC TGAGAAATGG TATATATATG TGTCTAATGT AAGTGTGTGC ATGCATGTGC 2760 

GCGTGCATGT GTGTGTGTGT GAGTGTCTTA AAGCATAACC ACAAACTGCA AAAAGCTAGG 2820 

TAAGCTATTT TGTTGCAGCT CATAAGGTGG TGAAAAGGAC TCTCCTGTGT TTCTTACTCA 2880 

TAGGCAAGGA CAACATGTGC TTTTTGGTGA GCTGCTCATA ATTCCTGAAA TGTGTGGTGC 2940 

CAGGGCAAGG GGGCCATCAC TGCAGTCAGG CCCTCAGAGG AGTCCTGCAG GCTTCCTACC 3000 

AGTGGTCTCC AAGGGTGCAG GAGTAACTGG GGCTGGGCCA GCCTCCCCCC TTACAAGGCT 3060 

GCTTTCCACG AAGGGAGGTC TGGTGTATCT CATGGGAGAA TCTGGGGTGT CTGTAGTGTC 3120 

ACCCCTCCAG CAGCGCCACA AGGACTGAGG TTGGGTAGGT GTGAGGTTCC AGAGGACAGC 3180 

AGGACACTCT CGCATACTTT GCCAAATGAG GCCTGCTCAG AGGAGTAGGA GCTGAAAGAT 3240 

GGTGCCTTCC ACCCTCTTGG GCTGTGTGCC CATCAGAGCA GGCTCAGCCT GCAAAGGCCC 3300 

TGCATTCAGA GGTCTTGTAA TCTACTTGTT GCAGGAGAAA GAAGGTAAAA AATGATTTTT 3360 

TTAAGAAAAG CTATTTTATT GCAGCTCTTT CCCAAGAGCT GTTCTGGGAA TGGCTGGTCT 3420 

TCATATTCCC AGTGGAGAGG GGAACAAGTG GGGCTGGGCA TATACCTATT CCGGCTTCTA 3480 

GTGGGATGGA GTTGGGGTAT AGAAATTAAC CAGGAAGATG TTTCCACCAA GCCTGCTGTG 3540 

AGTCAATTGA GGGAGTGTTT GGGTCCCAGG AGACTTGGAC GGGGGGAGTT TGGGTAGACT 3600 

AGGAAAGGAA AGTGCCATAT CAGGGTACCG GTACCGGCAA GCTCACATCT CAGCCAGGGG 3660 

CCATGCCCCA CTTCCCCTGA CCCCAGCTGT CTTGTCTCCA CTCTGTGAAA CCCACAGGGG 3720 

ATGTGATAAA CAGGGCTATT AGGGGTATCA GCCACGTCGA GCCCCCAGAC TCTGTGCACT 3780 

TCAGACCAGC AGCAGCAGGA GGGCTCCCGA GGGCCTTATG AGAAAACCTG TGTGGACATC 3840 

CCTTGGTGTA CACTAAGACA GAGCAGAGCC CAGCGCTCCC AAGCCTTCCT CCTTCCAGCT 3900 

TCTACCTCCA TGCTAGCATT GCTGGTGTTA GAGAGGAATT AACTTCCTGG TCTGTGCCCT 3960 

TCTCTAGAAG AATATAAGAT GCTCCTCCTC CTCACCCCTT CTCAGCCTCC TCCCAAGTCT 4020 

TCCTCTTCTG CACCACCCCC GAGTCCAAAC CCACCTCTTG CCCCAGCATT CAGGCTGGAA 4080 

AACACTGATG TGGACTCAGT ATGACAACTG AGATGGGGGA AGCCAGACAT GTGAGGACGC 4140 

TGTCCTCCGA GAGGTGTCCC CGGCTGTTAG CCAGCTGTGC TGTGGTGCTG TGGGTCTGTC 4200 

ATACCCTCCC TTGCTTCTGT TCACACTGGG AGGCCCACTC CTGGCTCACC TCTCCCTCTC 4260 

AGGGACCCAC GTGGGAGCCT GGATCCCTGG ACTGTCCTGG GCATAGGTTT CAGGGGCCTC 4320 

CTTTGTTGTC ATCAGAACCC AGAGGAATTC TTGTCCTAAA AAATACGTAT GGCATACCAA 4380 

TCTGTGCGGG GCAGTGTCCT AAGCACTTAG ACTACATCAG GGAAGAACAC AGACCACATC 4440 

CCCGTCCTCA TGCGGCTTAT GTTTTCTGGA GGAAAGTGGA GACACAAGTC CTTGGCTTTA 4500 

GGGCTCCCCC GGCTGGGGGC TGTGCAGTCC GGTCAGGGCG GGAGGGGAAA TGCACCGCTG 4560 

CATGTGAACC TTACCAGCCC AGGCGGATGC CCCTTCCCCT TAGCACTACC CTGGCCTCCT 4620 

GCATCCCCTC GCCTCATGTT CCTCCCACCT TCAAAGAATG AAGAGCCCCA TGGGCCCAGC 4680 

CCCTGCCCTG GGAACCAGGC AGCCTTCCAG ACCTCAGGGG CTGAGGCAGA CTATTAGGGC 4740 

AGGGCTGACT TTGGTGACAC TGCCCATTCC CTCTCAGGCC AGCTCAGGTC ACCCGGGCCT 4800 

CTGACCCAGG CCTGTCACTT TGAGAGGGGC AAAACTGAGA GGGGCTTTTC CTAGAGAAAG 4860 

AGAACAAGGA GCTTGCCAGG CTTCATGTAG CCGACACACG TCTCAGGATT TTAAGTCCAC 4920 

ATTGGCCTCA CACTAGCCTA GGCCAATGCC CAAAATAAGG AGTTCCAATT TGGGGCCAAA 4980 

TGAGGAAGGA CACAGACTCT GCCCTGGGAT CTCCTGTGCT AGCGGCCAAT GACAAATCCA 5040 

GTCATTGGCC ACCAGCCACC TCTGCAGTGG GGACCACACT AGCAGCCCTG ACTCCACACT 5100 

CCTCCTGGGG ACCCAAGAGG CAGTGTTGCT GTCTGCGTGT CCACCTTGGA ATCTGGCTGA 5160 

ACTGGCTGGG AGGACCAAGA CTGCGGCTGG GGTGGGCAGG GAAGGGAAGC CGGGGGCTGC 5220 

TGTGAGGGAT CTTGGAGCTT CCCTGTAGCC CACCTTCCCC TTGCTTCATG TTTGTAGAGG 5280 

AACCTTGTGC CGGCCAGGCC CAGTTTCCTT GTGTGATACA CTAATGTATT TGCTTTTTTT 5340 
GGAAATAGAG AAAATCAATA AATTGCTAGT GTTTCTTTGA AAAAAAAAA 

Seq ID NO: 79 Protein sequence: 
Protein Accession #: CAA98022.1 

1 11 21 31 41 51 

I I I I I I 

MNHSPLKTAL AYECFQDQDN STLALPSDQK MKTGTSGRQR VQEQVMMTVK RQKSKSSQSS 60 

TLSHSNRGSM YDGLADNYNY GTTSRSSYYS KFQAGNGSWG YPIYNGTLKR EPDNRRFSSY 120 

SQMENWSRHY PRGSCNTTGA GSDICFMQKI KASRSEPDLY CDPRGTLRKG TLGSKGQKTT 180 

QNRYSFYSTC SGQKAIKKCP VRPPSCASKQ DPVYIPPISC NKDLSFGHSR ASSKICSEDI 240 

ECSGLTIPKA VQYLSSQDEK YQAIGAYYIQ HTCFQDESAK QQVYQLGGIC KLVDLLRSPN 300 

QNVQQAAAGA LRNLVFRSTT NKLETRRQNG IREAVSLLRR TGNAEIQKQL TGLLWNLSST 360 
EELKEELIAD ALPVLADRVI IPFSGWCDGN SNMSREVVDP EVFFNATGCL RNLSSADAGR 420 

QTWRNYSGLI DSLMAYVQNC VAASRCDDKS VENCMCVLHN LSYRLDAEVP TRYRQLEYNA 480 

RNAYTEKSST GCFSNKSDKM MNNNYDCPLP EEETNPKGSG WLYHSDAIRT YLNLMGKSKK 540 

DATLEACAGA LQNLTASKGL MSSGMSQLIG LKEKGLPQIA RLLQSGNSDV VRSGASLLSN 600 

MSRHPLLHRV MGNQVFPEVT RLLTSHTGNT SNSEDILSSA CYTVRNLMAS QPQLAKQYFS 660 

SSMLNNIINL CRSSASPKAA EAARLLLSDM WSSKELQGVL RQQGFDRNML GTLAGANSLR 720 
NFTSRF 

Seq ID NO: 80 DNA sequence 

Nucleic Acid Accession #: NM_006516.1 

Coding sequence: 180-1658 



TAGTCGCGGG 
GTCAGAGTCG 
CGCACGCCGG 

. TGGAGCCCAG 
TTGGCTCCCT 
AGGAGTTCTA 
TCACCACGCT 
TCTCTGTGGG 
TGCTGGCCTT 
TGCTGATCCT 
CCATGTATGT 
AGCTGGGCAT 
GCAACAAGGA 
GCATCGTGCT 
AGAACCGGGC 
TGCAGGAGAT 
AGCTGTTCCG 
CCCAGCAGCT 
CGGGGGTGCA 
CTGTCGTGTC 
TCGCTGGCAT 
TACCCTGGAT 
TGGGTCCTGG 
CAGCTGCCAT 
GCTTCCAGTA 
TGGTTCTGTT 
ATGAGATCGC 
AGCTGTTCCA 
GCCTGCTCCC 
AACCTGACAG 
CCAGAAGAAT 
AAATCTATTC 
ATATCAGCCT 
GAGGGTGGAG 
CTGGACCTAT 
GAGGTGGCTA 
CATTAGGATT 
CCTGAGACCA 

. GCCGGGTTCT 
GGGAGCCTGC 
TGCAAGATAT 
ATATCTGGAC 
TATAAATGGC 
TTTGGATGGG 
GACTCAGGAT 
TTTGATCCCT 
ATCACATATT 
AGGCTTGAAA 



11 
I 

TCCCCGAGTG 
CAGTGGGAGT 
TCGCCACCCG 
CAGCAAGAAG 
GCAGTTTGGC 
CAACCAGACA 
CTGGTCCCTC 
CCTTTTCGTT 
CGTGTCCGCC 
GGGCCGCTTC 
GGGTGAAGTG 
CGTCGTCGGC 
CCTGTGGCCC 
GCCCTTCTGC 
CAAGAGTGTG 
GAAGGAAGAG 
CTCCCCCGCC 
GTCTGGCATC 
GCAGCCTGTG 
GCTGTTTGTG 
GGCGGGTTGT 
GTCCTATCTG 
CCCCATCCCA 
TGCCGTTGCA 
TGTGGAGCAA 
CTTCATCTTC 
TTCCGGCTTC 
TCCCCTGGGG 
AGCAGCCCTA 
ATGTCAGCCG 
ATTCAGGACT 
AGACAAGCAA 
GAGTCTCCTG 
ACTAAGCCCT 
GTCCTAAGGA 
TGGCCACCCG 
TGCCCCTTCC 
GTTGGGAGCA 
AGTCTCCTTT 
AAACTCACTG 
TTATATATAT 
AAGCCAACTT 
TGGTTTTTAG 
AGTGAGACAG 
CCAGTCCCTT 
GTTACCCAGA 
TGATAGTTGG 
TCGCATTATT 



21 
I 

AGCACGCCAG 
CCCCGGACCG 
CGTACCCGGC 
CTGACGGGTC 
TACAACACTG 
TGGGTCCACC 
TCAGTGGCCA 
AACCGCTTTG 
GTGCTCATGG 
ArCATCGGTG 
TCACCCACAG 
ATCCTCATCG 
CTGCTGCTGA 
CCCGAGAGTC 
CTAAAGAAGC 
AGTCGGCAGA 
TACCGCCAGC 
AACGCTGTCT 
TATGCCACCA 
GTGGAGCGAG 
GCCATACTCA 
AGCATCGTGG 
TGGTTCATCG 
GGCTTCTCCA 
CTGTGTGGTC 
ACCTACTTCA 
CGGCAGGGGG 
GCTGATTCCC 
AGGATCTCTC 
AGCCGGGCCT 
TAACGGCTCC 
CAGGTTTTAT 
TGCCCACATC 
GTCGAGACAC 
CACACTAATC 
TTCTGCTGGC 
CATCTCTTCC 
CTGGAGTGCA 
GCACTGAGGG 
CTCAAGAAGA 
TTTTGGTTGT 
GTAAATACAC 
AAACATGGTT 
AAGTAAGTGG 
ACACGTACCT 
GAATATATAC 
TGTTCAAAAA 
TTGAATGTGA 



31 
I 

GGAGCAGGAG 
GAGCACGAGC 
GCAGCCAGAG 
GCCTCATGCT 
GAGTCATCAA 
GCTATGGGGA 
TCTTTTCTGT 
GCCGGCGGAA 
GCTTCTCGAA 
TGTACTGCGG 
CCTTTCGTGG 
CCCAGGTGTT 
GCATCATCTT 
CCCGCTTCCT 
TGCGCGGGAC 
TGATGCGGGA 
CCATCCTCAT 
TCTATTACTC 
TTGGCTCCGG 
CAGGCCGGCG 
TGACCATCGC 
CCATCTTTGG 
TGGCTGAACT 
ACTGGACCTC 
CCTACGTCTT 
AAGTTCCTGA 
GAGCCAGCCA 
AAGTGTGAGT 
AGGAGCACAG 
GGGGCTCCTT 
AGGATTTTAA 
AATTTTTTTA 
CCAGGCTTCA 
TTGCCTTCTT 
GAACTATGAA 
CTGGATCTCC 
TACCCAACCA 
GGGAGGAGAG 
CCACACTATT 
CATGGAGACT 
CAATATTAAA 
CACCTCACTC 
TTGAAATGCT 
GGTTGCAACC 
CTCATCAGTG 
ATTCTTTATC 
AACACTAGTT 
AGGGAA 



41 
I 

ACCAAACGAC 
CTGAGCGGGA 
CCACCAGCGC 
GGCTGTGGGA 
TGCCCCCCAG 
GAGCATCCTG 
TGGGGGCATG 
TTCAATGCTG 
ACTGGGCAAG 
CCTGACCACA 
GGCCCTGGGC 
CGGCCTGGAC 
CATCCCGGCC 
GCTCATCAAC 
AGCTGACGTG 
GAAGAAGGTC 
CGCTGTGGTG 
CACGAGCATC 
TATCGTCAAC 
GACCCTGCAC 
GCTAGCACTG 
CTTTGTGGCC 
CTTCAGCCAG 
AAATTTCATT 
CATCATCTTC 
GACTAAAGGC 
AAGTGATAAG 
CGCCCCAGAT 
GCAGCTGGAT 
TCTCCAGCCA 
CAAAAGCAAG 
TTACTGATTT 
CCCTGAATGG 
CACCCAGCTA 
CTACAAAGCT 
CCACTCTAGG 
CTCAAATTAA 
GGGAAGGGCC 
ACCATGAGAA 
CCTGCCCTGT 
TACAGACACT 
CTGTTACTTA 
TGTGGATTGA 
ACTGCAACGG 
TCCTCTTGCT 
TTGACATTCA 
TTGTGCCAGC 



51 

I 

GGGGGTCGGA 
GAGCGCOGCT 
AGCGCTGCCA 
GGAGCAGTGC 
AAGGTGATCG 
CCCACCACGC 
ATTGGCTCCT 
ATGATGAACC 
TCCTTTGAGA 
GGCTTCGTGC 
ACCCTGCACC 
TCCATCATGG 
CTGCTGCAGT 
CGCAACGAGG 
ACCCATGACC 
ACCATCCTGG 
CTGCAGCTGT 
TTCGAGAAGG 
ACGGCCTTCA 
CTCATAGGCC 
CTGGAGCAGC 
TTCTTTGAAG 
GGTCCACGTC 
GTGGGCATGT 
ACTGTGCTCC 
CGGACCTTCG 
ACACCCGAGG 
CACCAGCCCG 
GAGACTTCCA 
GCAATGATGT 
ACTGTTGCTC 
TGTTATTTTT 
TTCCATGCCT 
ATCTGTAGGG 
TCTATCCCAG 
GGTCAGGCTC 
TCTTTCTTTA 
AGTCTGGGCT 
GAGGGCCTGT 
TGTGTATAGA 
AAGTTATAGT 
CCTAAACAGA 
GGGTAGGAGG 
CTTAGACTTC 
CAAAAATCTG 
AGGCATTTCT 
CGTGATGCTC 



Seq ID NO: 81 Protein sequence: 
Protein Accession #: NP_006507.1 



1 11 21 31 41 51

I I I I I I

MEPSSKKLTG RLMLAVGGAV LGSLQFGYNT GVINAPQKVI BEFYNQTWVH RYGESILPTT
LTTLWSLSVA IFSVGGMIGS FSVGLFVNRF GRRNSMLMMN LLAFVSAVLM GFSKLGKSFE
MLILGRFIIG VYCGLTTGFV PMYVGEVSPT AFRGALGTLH QLGIWGILI AQVFGLDSIM
GNKDLWPLLL SIIFIPALLQ CIVIiPFCPES PRFLLINRNE ENRAKSVLKK LRGTADVTHD
LQEMKEESRQ MMREKKVTIL ELFRSPAYRQ PILIAWLQL SQQLSGINAV FYYSTSIFEK
AGVQQPVYAT IGSGIVNTAF TWSLFWER AGRRTLHLIG LAGMAGCAIL MTIALALLEQ
LPWMSYLSIV AIFGFVAFFE VGPGPIPWFI VAELFSQGPR PAAIAVAGFS NVrrSNFIVGM
CFQYVEQLCG PYVFIIFTVL LVLFFIFTYF KVPETKGRTF DEIASGFRQG GASQSDKTPE
ELFHPLGADS QV



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 



60 
120 
180 
240 
300 
360 
420 
480 



Seq ID NO: 82 DNA sequence 
Nucleic Acid Accession #: BC001291 
Coding sequence: 44-541 

1 11 21 31 41 51 

I I I I I I 

GGGGGCGCCG CGCGCTGACC CTCCCTGGGC ACCGCTGGGG ACGATGGCGC TGCTCGCCTT 
GCTGCTGGTC GTGGCCCTAC CGCGGGTGTG GACAGAOGCC AACCTGACTG CGAGACAACG 



60 
120 
AGATCCAGAG GACTCCCAGC GAACGGACGA GGGTGACAAT AGAGTGTGGT GTCATGTTTG
TGAGAGAGAA AACACTTTCO AGTGCCAGAA CCCAAGGAGG TGCAAATGGA CAGAGCCATA
CTGCGTTATA GCGGCCGTGA AAATATTTCC AOGTTTTTTC ATGGTTGCGA AGCAGTGCTC
CGCTGGTTGT GCAGCGATGG AGAGACCCAA GCCAGAGGAG AAGCGGTTTC TCCTGGAAGA
GCCCATGCCC TTCTTTTACC TCAAGTGTTG TAAAATTCGC TACTGCAATT TAGAGGGGCC
ACCTATCAAC TCATCAGTGT TCAAAGAATA TGCTGGGAGC ATGGGTGAGA GCTGTGGTGG
GCTGTGGCTG GCCATCCTCC TGCTGCTGGC CTCCATTGCA GCCGGCCTCA GCCTGTCTTG
AGCCACGGGA CTGCCACAGA CTGAGCCTTC CGGAGCATGG ACTCGCTCCA GACCGTTGTC
ACCTGTTGCA TTAAACTTGT TTTCTGTTGA TTACCTCTTG GTTTGACTTC CCAGGGTCTT
GGGATGGGAG AGTGGGGATC AGGTGCAGTT GGCTCTTAAC CCTCAAGGGT TCTTTAACTC
ACATTCAGAG GAAGTCCAGA TCTCCTGAGT AGTGATTTTG GTGACAAGTT TTTCTCTTTG
AAATCAAACC TTGTAACTCA TTTATTGCTG ATGGCCACTC TTTTCCTTGA CTCCCCTCTG
CCTCTGAGGG CTTCAGTATT GATGGGGAGG GAGGCCTAAG TACCACTCAT GGAGAGTATG
TGCTGAGATG CTTCCGACCT TTCAGGTGAC GCAGGAACAC TGGGGGAGTC TGAATGATTG
GGGTGAAGAC ATCCCTGGAG TGAAGGACTC CTCAGCATGG GGGGCAGTGG GGCACACGTT
AGGGCTGCCC CCATTCCAGT GGTGGhGGCG CTGTGGATGG CTGCTTTTCC TCAACCTTTC
CTACCAGATT CCAGGAGGCA GAAGATAACT AATTGTGTTG AAGAAACTTA GACTTCACCC
ACCAGCTGGC ACAGGTGCAC AGATTCATAA ATTCCCACAC GTGTGTGTTC AACATCTGAA
ACTTAGGCCA AGTAGAGAGC ATCAGGGTAA ATGGCGTTCA TTTCTCTGTT AAGATGCAGC
CATCCATGGG GAGCTGAGAA ATCAGACTCA AAGTTCCACC AAAAACAAAT ACAAGGGGAC
TTCAAAAGTT CACGAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAA



Seq ID NO: 83 Protein sequence: 
Protein Accession #: AAH01291 

1 11 21 31 41 51 

I I I I I I 

MALLALLLW ALPRVWTDAN LTARQRDPED SQRTDEGDNR VWCHVCEREN TFECQNPRRC 
KWTEPYCVIA AVKIFPRPFM VAKQCSAGCA AMERPKPEBK RFLLEEPMPP FYLKCCKIRY 
CNLEGPPINS SVFKEYAGSM GESCGGLWLA ILLLLASIAA GLSLS 

Seq ID NO: 84 DNA sequence 

Nucleic Acid Accession #: NM_022893.1 

Coding sequence: 229-2726 



I 

TTTTTTTTTT 
TGCGCCATCT 
TTTTCTCTGG 
CGCCCGCCGC 
AAGCAAGGCA 
ATTCTTACAG 
CTCCTCACCT 
GAGCACAAAC 
CCTTCCCCTT 
GTCACGCCAG 
GAACACATAG 
GGAGCTCTAA 
GATGAGCCCA 
CTCTTGCAAC 
AGTCCCCTGA 
CCACCTCTCC 
GGATCAGTAT 
CTGTTTAGTC 
GAAGAAATGG 
CCAATGGCTA 
AACACGTCTA 
CCATTCCAGC 
TCCGCCCCTC 
ACGTTCAAAT 
TACAAGTGCA 
AAGACGCACA 
GCCAGCTCCC 
TCCGTGGTGG 
GAGGAGGAAG 
CTGACGGAGA 
CACGAGAACA 
GACGTCATGC 
GTCCTGGGCG 
TGCGACGAAG 
CGCGGCTGCT 
AGCCCCAGCT 
CCCCCGGCCA 
GCCTCCAGGC 
GCCTCCTCGT 
GAGCTGGACG 
ATTAGTGGTC 
GAGTACTGTG 
ACGGGCGAAA 
CTCACCAGGC 
TGTAAGATGC 
GATCGAGTGT 
CTCCCACCTG 
CCTGTAGGAT 
ACGAAGCTAA 
TTCTTTTTTC 



11 
I 

TTTTTTGCTT 
TTGTATTATT 
AGTCTCCTTC 
CGCCGCCGCC 
AACCCCAGCA 
ATGATGAACC 
GTGGGCAGTG 
GGAAACAATG 
CACCAATCGA 
AAGATGACGA 
CAGATAAACT 
TCCCCACGCC 
GCAGCTACAC 
ACGCACAGAA 
CCCCGCGGGT 
ATGGGATTCA 
CGAGAGAGGC 
CACCACCGAG 
CCCTGGCCAC 
TGGAGCCTCC 
GCCCACCGCT 
CAGGTAGCAA 
CTCCCTCCCA 
TTCAGAGCAA 
ACCTGTGCGA 
TGCACAAATC 
CGGAACCCGG 
CCAAGTTCAA 
AGGAGGACGA 
GCGAGAGGGT 
GCTCGCGGGG 
AGGGCATGGT 
AGAAGCATAA 
ACTCGGTGGC 
CCCCGGGCGA 
CGCTGAGCCC 
CGATGCCCAA 
AGCTCAAAGA 
CGGAGCACTC 
GAGGGATCTC 
CGGGCACGGG 
GGAAAGTCTT 
GGCCTTATAA 
ACATGAAAAC 
CTTTTAGCGT 
TGAATAATGA 
ACACCCCCTT 
TTTTTTCTAG 
GAATATGAGA 
TTTTTCCTTT 



21 
I 

AAAAAAAAGC 
TCTAATTTAT 
TTTCTAACCC 
GCCGCCGCCG 
CTTAAGCAAA 
AGACCACGGC 
CCAGATGAAC 
CAATGGCAGC 
GATGAAAAAA 
TTGTTTATCA 
TCTGCACTGG 
TGGGATGAGT 
ATGTACAACT 
CACTCATGGA 
TGGTATCCCT 
TATTGCAGAC 
TTCCGGCCTG 
ACATCACTTG 
CCATCACCCG 
CGCCATGGAT 
GTCCCCAGGC 
GCCGCCCTTC 
GCCCCCGGTC 
CCTGGTGGTG 
CCACGCGTGC 
GTCCCCCATG 
CACCAGCGAC 
GAGCGAGAAC 
CGAGGAAGAG 
GGACTACGGC 
CGCGGTCGTG 
GCTCAGCTCC 
GCGCGGCCAC 
CGGCGAGTCG 
GTCGGCCTCG 
CTTCTCTAAG 
CACGGAGAAC 
TCCCTTCCTT 
CTCGGAGAAC 
GGGGCGCAGC 
CAGGCCCAGC 
CAAGAACTGT 
ATGCGAGCTG 
GCATGGCCAG 
GTACAGTACC 
TATAAAAACT 
TTTCACCACT 
TCCCATGTGA 
GTGCTTGTCA 
TTTTTTTTTT 



31 
I 

CATGACGGCT 
TTTGGATGTC 
GGCTCTCCCG 
CCCGCCCCGC 
CGGGAATTCT 
CCGTTGGGAG 
TTCCCATTGG 
CTCTGCTTAG 
GCATCCAATC 
ACGTCATCTA 
AGGGGCCTCT 
GCAGAATATG 
TGCAAACAGC 
TTAAGAATCT 
TCAGGACTAG 
AATAACCCCT 
GCAGAAGGGC 
GACCCCCACC 
AGTGCCTTTG 
TTCTCTAGGA 
CGGCCCAGCC 
CTGGCGACGC 
AAGTCCAAGT 
CACCGGCGCA 
ACCCAGGCCA 
ACGGTCAAGT 
TTGGTGGGCA 
GACCCCAACC 
GAAGAAGAGG 
TTCGGGCTGA 
GGCGTGGGCG 
ATGCAGCACT 
CTGGCCGAGG 
GACCGCATAG 
GGGGGCCTGT 
CGCATCAAGC 
GTGTACTCGC 
AGCTTCGGAG 
GGGAGCTTGC 
GGCACGGGAA 
TCAAAAGAGG 
AGCAATCTCA 
TGCAACTATG 
GTGGGGAAGG 
CTGGAGAAAC 
GAATAGAGGT 
CCCTTTCCCC 
TTTAAACAAA 
CCAGCACACC 
TCCTTTATGT 



41 
I 

CTCCCACAAT 
AAAAGGCACT 
ATGTGAACCG 
AGCCCACCAT 
CGCCCGAGCC 
CTCCAGAAGG 
GGGACATTCT 
AAAAAGCTGT 
CCGTGGAGGT 
GAAGAATTTG 
CCTCCCCTCG 
CCCCGCAGGG 
CATTCACCAG 
ACTTAGAAAG 
GTGCAGAATG 
TTAACCTGCT 
GCTTTCCACC 
GCATAGAGCG 
ACAGGGTGCT 
GACTTAGAGA 
CTATGCAAAG 
CCCCCCTCCC 
CATGCGAGTT 
GCCACACGGG 
GCAAGCTGAA 
CCGACGACGG 
GCGCCAGCAG 
TGATCCCGGA 
AGGAAGAGGA 
GCCTGGAGGC 
ACGAGAGCCG 
TCAGCGAGGC 
CCGAGGGCCA 
ACGATGGCAC 
CCAAAAAGCT 
TCGAGAAGGA 
AGTGGCTCGC 
ACTCCAGACA 
GCTTCTCCAC 
GTGGAGGGAG 
GCAGACGCAG 
CTGTCCACAG 
CCTGTGCCCA 
ACGTTTACAA 
ACATGAAAAA 
ATATTAATAC 
ATCGCCCTCC 
CAAACAAACA 
TGTTTTTTTT 
TCTCACCGTT 



180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 



60 
120 



51 
I 

TCATCTTCCC 
GATGAAGATA 
AGCCGTCGTC 
GTCTCGCCGC 
TCTTGAAGCC 
GGATCATGAC 
TATTTTTATC 
GGATAAGCCA 
TGGCATCCAG 
CCCCAAACAG 
TTCTGCACAT 
TATTTGTAAA 
TGCATGGTTT 
CGAACACGGA 
TCCTTCCCAG 
AAGAATACCA 
CACTCCCCCC 
CCTGGGGGCG 
GCGGTTGAAT 
GCTGGCAGGG 
GTTACTGCAA 
TCCTCTGCAA 
CTGCGGCAAG 
CGAGAAGCCC 
GCGCCACATG 
TCTCTCCACC 
CGCGCTCAAG 
GAACGGGGAC 
GGAGGAGGAG 
GGCGCGCCAC 
CGCCCTGCCC 
CTTCCACCAG 
CAGGGACACT 
TGTTAATGGC 
GCTGCTGGGC 
GTTCGACCTG 
CGGCTACGCG 
ATCGCCTTTT 
ACCGCCCGGG 
CACGCCCCAT 
CGACACTTGT 
GAGAAGCCAC 
GAGTAGCAAG 
ATGTGAAATT 
ATGGCACAGT 
CCCTCCCTCA 
AGCCCCACTC 
AACAGAAGTA 
CTTTTTCTTT 
TGAATGCATG 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
ATCTGTATGG 66CAATACTA TTGCATTTTA OGCAAACTTT GAGCCTTTCT CTTGTGCAAT 3060 

AATTTACATQ TTGTGTATGT TTTTTTTTAA ACTTAGACAG CATGTATGGT ATGTTATGGC 3120 

TATTTTAAAT TGTCCCTAAT TCGTTGCTGA GCAAACATGT TGCTGTTTCC AGTTCCGTTC 3180 

TGAGAGAAAA AGAGAGAGAG AGAGAAAAAG ACCATGCTGC ATACATTCTG TAATACATAT 3240 

CATGTACAGT TTTATTTTAT AACGTGAGGA GGAAAAACAG TCTTTGGATT AACCCTCTAT 3300 

AGACAGAATA GATAGCACTG AAAAAAAATC TCTATGAGCT AAATGTCTGT CTCTAAAGGG 3360 

TTAAATGTAT CAATTGGAAA GGAAGAAAAA AGGCCTTGAA TTGACAAATT AACAGAAAAA 3420 

CAGAACAAGT TTATTCTATC ATTTGGTTTT AAAATATGAG TGCCTTGGAT CTATTAAAAC 3480 

CACATCGATG GTTCTTTCTA CTTGTTATAA ACTTGTAGCT TAATTCAGCA TTGGGTGAGG 3540 

TAATAAACCT TAGGAACTAG CATATAATTC TATATTGTAT TTCTCACAAC AATGGCTACC 3600 

TAAAAAGATG ACCCATTATG TCCTAGTTAA TCATCATTTT TCCTTTAGTT TAATTTTATA 3660 

AACAAAACTG ATTATACCAG TATAAAAGCT ACTTTGCTCC TGGTGAGAGC TTAAAAGAAA 3720 

TGGGCTGTTT TGCCCAAAGT TTTATTTTTT TTAAACAATG ATTAAATTGA ATGTGTAATG 3780 

TGCAAAAGCC CTGGAACGCA ATTAAATACA CTAGTAAGGA GTTCATTTTA TGAAGATATT 3840 

TGCTTTAATA ATGTCTTTTT AAAAATACTG GCACCAAAAG AAATAGATCC AGATCTACTT 3900 

GGTTGTCAAG TGGACAATCA AATGATAAAC TTTAAGACCT TGTATACCAT ATTGAAAGGA 3960 

AGAGGCTGAC AATAAGGTTT GACAGAGGGG AACAGAAGAA AATAATATGA TTTATTAGCA 4020 

CAACGTGGTA CTATTTGCCA TTTAAAACTA GAACAGGTAT ATAAGCTAAT ATTGATACAA 4080 

TGATGATTAA CTATGAATTC TTAAGACTTG CATTTAAATG TGACATTCTT AAAAAAAGAA 4140 

GAGAAAGAAT TTTAAGAGTA GCAGTATATA TGTCTGTGCT CCCTAAAAGT TGTACTTCAT 4200 

TTCTTTTCCA TACACTGTGT GCTATTTGTG TTAACATGGA AGAGGATTCA TTGTTTTTAT 4260 

TTTTATTTTT TTAATTTTTT CTTTTTTATT AAGCTAGCAT CTGCCCCAGT TGGTGTTCAA 4320 

ATAGCACTTG ACTCTGCCTG TGATATCTGT ATCTTTTCTC TAATCAGAGA TACAGAGGTT 4380 

GAGTATAAAA TAAACCTGCT CAGATAGGAC AATTAAGTGC ACTGTACAAT TTTCCCAGTT 4440 

TACAGGTCTA TACTTAAGGG AAAAGTTGCA AGAATGCTGA AAAAAAATTG AACACAATCT 4500 

CATTGAGGAG CATTTTTTAA AAACTAAAAA AAAAAAAACT TTGCCAGCCA TTTACTTGAC 4560 

TATTGAGCTT ACTTACTTGG ACGCAACATT GCAAGCGCTG TGAATGGAAA CAGAATACAC 4620 

TTAACATAGA AATGAATGAT TGCTTTCGCT TCTACAGTGC AAGGATTTTT TTGTACAAAA 4680 

CTTTTTTAAA TATAAATGTT AAGAAAAATT TTTTTTAAAA AACACTTCAT TATGTTTAGG 4740 

GGGGAACTGC ATTTTAGGGT TCCATTGTCT TGGTGGTGTT ACAAGACTTG TTATCCATTT 4800 

AAAAATGGTA GTGGAAATTC TATGCCTTGG ATACACACCG CTCTTCAGGT TGTAAAAAAA 4860 

AAAAACATAC ATTGGGGAAA GGTTTAAGAT TATATAGTAC TTAAATATAG GAAAATGCAC 4920 

ACTCATGTTG ATTCCTATGC TAAAATACAT TTATGGTCTT TTTTCTGTAT TTCTAGAATG 4980 

GTATTTGAAT TAAATGTTCA TCTAGTGTTA GGCACTATAG TATTTATATT GAAGCTTGTA 5040 

TTTTTAACTG TTGCTTGTTC TCTTAAAAGG TATCAATGTA CCTTTTTTGG TAGTGGAAAA 5100 

AAAAAAGACA GGCTGCCACA GTATATTTTT TTAATTTGGC AGGATAATAT AGTGCAAATT 5160 

ATTTGTATGC TTCAAAAAAA AAAAAAAGAG AGAAACAAAA AAGTGTGACA TTACAGATGA 5220 

GAAGCCATAT AATGGCGGTT TGGGGGAGCC TGCTAGAATG TCACATGGAT GGCTGTCATA 5280 

GGGGTTGTAC ATATCCTTTT TTGTTCCTTT TTCCTGCTGC CATACTGTAT GCAGTACTGC 5340 

AAGCTAATAA CGTTGGTTTG TTATGTAGTG TGCTTTTTGT CCCTTTCCTT CTATCACCCT 5400 

ACATTCCAGC ATCTTACCTT CATATGCAGT AAAAGAAAGA AAGAAAAAAA AAGGAAAAAA 5460 

AAAAAAAAAC CAATGTTTTG CAGTTTTTTT CATTGCCAAA AACTAAATGG TGCTTTATAT 5520 

TTAGATTGGA AAGAATTTCA TATGCAAAGC ATATTAAAGA GAAAGCCCGC TTTAGTCAAT 5580 

ACTTTTTTGT AAATGGCAAT GCAGAATATT TTGTTATTGG CCTTTTCTAT TCCTGTAATG 5640 

AAAGCTGTTT GTCGTAACTT GAAATTTTAT CTTTTACTAT GGGAGTCACT ATTTATTATT 5700 

GCTTATGTGC CCTGTTCAAA ACAGAGGCAC TTAATTTGAT CTTTTATTTT TCTTTGTTTT 5760 

TATTTTTTTT TTTATTTAGA TGACCAAAGG TCATTACAAC CTGGCTTTTT ATTGTATTTG 5820 

TTTCTGGTCT TTGTTAAGTT CTATTGGAAA AACCACTGTC TGTGTTTTTT TGGCAGTTX3T 5880 

CTGCATTAAC CTGTTCATAC ACCCATTTTG TCCCTTTATT GAAAAAATAA AAAAAATTAA 5940 
A 

Seq ID NO: 85 Protein sequence: 
Protein Accession #: NP_075044.1 

1 11 21 31 41 51 

I I I I I I 

MSRRKQGKPQ HLSKREFSPE PLEAILTDDE PDHGPLGAPE GDHDLLTCGQ CQMNFPLGDI 60 

LIFIEHKRKQ CNGSLCLEKA VDKPPSPSPI EMKKASNPVE VGIQVTPEDD DCLSTSSRRI 120 

CPKQEHIADK LLHWRGLSSP RSAHGALIPT PGMSAEYAPQ GICKDEPSSY TCTTCKQPFT 180 

SAWFLLQHAQ NTHGLRIYLE SEHGSPIiTPR VGIPSGLGAE CPSQPPLHGI HIADNNPFNL 240 

LRIPGSVSRE ASGLAEGRFP PTPPLFSPPP RHHLDPHRIE RLGAEEMALA THHPSAFDRV 300 

LRLNPMAMEP PAMDFSRRLR ELAGNTSSPP LSPGRPSPMQ RLLQPFQPGS KPPFLATPPL 360 

PPLQSAPPPS QPPVKSKSCE FCGKTFKFQS NLWHRRSHT GEKPYKCNLC DHACTQASKL 420 

KRHMKTHMHK SSPMTVKSDD GLSTASSPEP GTSDLVGSAS SAUCSWAKF KSENDPNLIP 480 

ENGDEEEEED DEEEEEEEEE EEEELTESER VDYGFGLSLE AARHHENSSR GAWGVGDES 540 

RALPDVMQGM VLSSMQHFSE AFHQVLGEKH KRGHLAEAEG HRDTCDEDSV AGESDRIDDG 600 

TVNGRGCSPG ESASGGLSKK LLLGSPSSLS PFSKR1KLEK EFDLPPATMP NTENVYSQWL 660 

AGYAASRQLK DPFLSFGDSR QSPFASSSEH SSENGSLRFS TPPGELDGGI SGRSGTGSGG 720 

STPHISGPGT GRPSSKEGRR SDTCEYCGKV FKNCSNLTVH RRSHTGERPY KCELCNYACA 780 
QSSKLTRHMK THGQVGKDVY KCEICKMPFS VYSTLEKHMK KWHSDRVLNN DIKTE 



Seq ID NO: 86 DNA sequence 

Nucleic Acid Accession #: XM_035292.2 

Coding sequence: 53-1576 

1 11 21 31 41 51 

I I I I I I 

GCTCGCTGGG CCGCGGCTCC CGGGTGTCCC AGGCCCGGCC GGTGCGCAGA GCATGGCGGG 60 

TGCGGGCCCG AAGCGGCGCG CGCTAGCGGC GCCGGCGGCC GAGGAGAAGG AAGAGGCGCG 120 

GGAGAAGATG CTGGCCGCCA AGAGCGCGGA CGGCTCGGCG CCGGCAGGCG AGGGCGAGGG 180 

CGTGACCCTG CAGCGGAACA TCACGCTGCT CAACGGCGTG GCCATCATCG TGGGGACCAT 240 

TATCGGCTCG GGCATCTTCG TGACGCCCAC GGGOGTGCTC AAGGAGGCAG GCTCGCCGGG 300 

GCTGGCGCTG GTGGTGTGGG CCGCGTGCGG CGTCTTCTCC ATCGTGGGCG CGCTCTGCTA 360 

CGCGGAGCTC GGCACCACCA TCTCCAAATC GGGCGGCGAC TACGCCTACA TGCTGGAGGT 420 

CTACGGCTCG CTGCCCGCCT TCCTCAAGCT CTGGATCGAG CTGCTCATCA TCCGGCCTTC 480 

ATCGCAGTAC ATCGTGGCCC TGGTCTTCGC CACCTACCTG CTCAAGCCGC TCTTCCCCAC 540 



CTGCCCGGTG 
GGCCGTGAAC 
CAAGCTCCTG 
TGTGTCCAAT 
TGTGCTGGCA 
CACAGAGGAA 
CATCGTGACG 
GCAGATGCTG 
GTCCTGGATC 
GTTCACATCC 
CTCCATGATC 
GACGCTGCTC 
CAACTGGCTC 
TGAGCTTGAG 
CCTCTTCCTG 
CATCATCCTC 
GTGGCTCCTC 
CCCCCAGGAG 



CCCGAGGAGG 
TGCTACAGCG 
GCCCTGGCCC 
CTAGATCCCA 
TTATACAGCG 
ATGATCAACC 
CTGGTGTACG 
TCGTCCGAGG 
ATCCCCGTCT 
TCCAGGCTCT 
CACCCACAGC 
TACGCCTTCT 
TGCGTGGCCC 
CGGCCCATCA 
ATCGCCGTCT 
AGCGGGCTGC 
CAGGGCATCT 
ACATAGCCAG 



CAGCCAAGCT 
TGAAGGCCGC 
TGATCATCCT 
ACTTCTCATT 
GCCTCTTTGC 
CCTACAGAAA 
TGCTGACCAA 
CCGTGGCCGT 
TCGTGGGCCT 
TCTTCGTGGG 
TCCTCACCCC 
CCAAGGACAT 
TGGCCATCAT 
AGGTGAACCT 
CCTTCTGGAA 
CCGTCTACTT 
TCTCCACGAC 
GAGGCCGAGT 



CGTGGCCTGC 
CACCCGGGTC 
GCTGGGCTTC 
TGAAGGCACC 
CTATGGAGGA 
CCTGCCCCTG 
CCTGGCCTAC 
GGACTTOGGG 
GTCCTGCTTC 
GTCCCGGGAA 
CGTGOCGTCC 
CTTCTCOGTC 
CGGCATGATC 
GGCCCTGCCT 
GACACCOGTG 
CTTCGGGGTC 
CGTCCTGTGT 
GGCTGCCGGA 



Seq ID NO: 87 Protein sequence: 
Protein Accession #: XP_035292.2 



MAGAGPKRRA 
GTIIGSGIFV 
LEVYGSLPAF 
LLTAVNCYSV 
GNIVLALYSG 
STEQMLSSEA 
SILSMIHPQL 
RKPELERPIK 
KPKWLLQGIF 



11 
I 

LAAPAAEEKE 
TPTGVLKEAG 
LKLWIELIiII 
KAATRVQDAF 
LFAYGGWNYL 
VAVDFGNYHL 
LTPVPSLVFT 
VNLALPVFFI 
STTVLCQKLM 



21 

I 

EAREKMLAAX 
SPGIiALWWA 
RPSSQYIVAL 
AAAKLLALAL 
NFVTEEMINP 
GVMSWIIPVF 
CVMTLLYAFS 
LACLFIilAVS 
QWPQET 



31 

I 

SADGSAPAGE 
ACGVFSIVGA 
VFATYLLKPL 
IILLGFVQIG 
YRNLPLAIII 
VGLSCFGSVN 
KDIFSVINFF 
FWKTPVECGI 



CTCTGOGTGC 
CAGGATGCCT 
GTCCAGATCG 
AAACTGGATG 
TGGAATTACT 
GCCATCATCA 
TTCACCACCC 
AACTATCACC 
GGCTCCGTCA 
GGCCACCTGC 
CTCGTGTTCA 
ATCAACTTCT 
TGGCTGCGCC 
GTGTTCTTCA 
GAGTGTGGCA 
TGGTGGAAAA 
CAGAAGCTCA 
GGAGCATGC 



41 
I 

GEGVTLQRNI 
LCYAELGTTI 
FPTCPVPEEA 
KGDVSNLDPN 
SLPIVTLVYV 
GSLFTSSRLF 
SFFNWLCVAL 
GFTIILSGIjP 



TGCTGCTCAC 
TTGCCGCCGC 
GGAAGGGTGA 
TGGGGAACAT 
TGAATTTCGT 
TCTCCCTGCC 
TGTCCACCGA 
TGGGCX3TCAT 
ATGGGTCCCT 
CCTCCATCCT 
CGTGTGTGAT 
TCAGCTTCTT 
ACAGAAAGCC 
TCCTGGCCTG 
TCGGCTTCAC 
ACAAGCCCAA 
TGCAGGTGGT 



51 
I 

TLLNGVAIIV 
SKSGGDYAYM 
AKLVACLCVL 
FSFEGTKLDV 
LTNLAYFTTL 
FVGSREGHLP 
AIIGMIWLRH 
VYFFGVWWKN 



Seq ID NO: 88 DNA sequence 

Nucleic Acid Accession #: NM_005268.1 

Coding sequence: 168-989 



1 11 21 31 41 51

I I I I I I

TAAAAAGCAA AAGAATTCGC GGCCGCGTCG ACACGGGCTT CCCCGAAAAC CTTCCCCGCT
TCTGGATATG AAATTCAAGC TGCTTGCTGA GTCCTATTGC CGGCTGCTGG GAGCCAGGAG
AGCCCTGAGG AGTAGTCACT CAGTAGCAGC TGACGCGTGG GTCCACCATG AACTGGAGTA
TCTTTGAGGG ACTCCTGAGT GGGGTCAACA AGTACTCCAC AGCCTTTGGG CGCATCTGGC
TGTCTCTGGT CTTCATCTTC CGCGTGCTGG TGTACCTGGT GACGGCCGAG CGTGTGTGGA
GTGATGACCA CAAGGACTTC GACTGCAATA CTCGCCAGCC CGGCTGCTCC AACGTCTGCT
TTGATGAGTT CTTCCCTGTG TCCCATGTGC GCCTCTGGGC CCTGCAGCTT ATCCTGGTGA
CATGCCCCTC ACTGCTCGTG GTCATGCACG TGGCCTACCG GGAGGTTCAG GAGAAGAGGC
ACCGAGAAGC CCATGGGGAG AACAGTGGGC GCCTCTACCT GAACCCCGGC AAGAAGCGGG
GTGGGCTCTG GTGGACATAT GTCTGCAGCC TAGTGTTCAA GGCGAGCGTG GACATCGCCT
TTCTCTATGT GTTCCACTCA TTCTACCCCA AATATATCCT CCCTCCTGTG GTCAAGTGCC
ACGCAGATCC ATGTCCCAAT ATAGTGGACT GCTTCATCTC CAAGCCCTCA GAGAAGAACA
TTTTCACCCT CTTCATGGTG GCCACAGCTG CCATCTGCAT CCTGCTCAAC CTCGTGGAGC
TCATCTACCT GGTGAGCAAG AGATGCCACG AGTGCCTGGC AGCAAGGAAA GCTCAAGCCA
TGTGCACAGG TCATCACCCC CACGGTACCA CCTCTTCCTG CAAACAAGAC GACCTCCTTT
CGGGTGACCT CATCTTTCTG GGCTCAGACA GTCATCCTCC TCTCTTACCA GACCGCCCCC
GAGACCATGT GAAGAAAACC ATCTTGTGAG GGGCTGCCTG GACTGGTCTG GCAGGTTGGG
CCTGGATGGG GAGGCTCTAG CATCTCTCAT AGGTGCAACC TGAGAGTGGG GGAGCTAAGC
CATGAGGTAG GGGCAGGCAA GAGAGAGGAT TCAGACGCTC TGGGAGCCAG TTCCTAGTCC
TCAACTCCAG CCACCTGCCC CAGCTCGACG GCACTGGGCC AGTTCCCCCT CTGCTCTGCA
GCTCGGTTTC CTTTTCTAGA ATGGAAATAG TGAGGGCCAA TGC



Seq ID NO: 89 Protein sequence: 
Protein Accession #: NP_005259.1 



1 11 21 31 41 51

I I I I I I

MNWSIFEGLL SGVNKYSTAF GRIWLSLVFI FRVLVYLVTA ERVWSDDHKD FDCNTRQPGC
SNVCFDEFFP VSHVRLWAIrQ LILVTCPSLL WMHVAYREV QEKRHREAHG ENSGRLYLNP
GKKRGGLWWT YVCSLVFKAS VDIAFLYVFH SFYPKYIDPP WKCHADPCP NIVDCFISKP
SEKNIFTLFM VATAAICILL NIiVELIYLVS KRCHECLAAR KAQAMCTGHH PHGTTSSCKQ
DDLLSGDLIF LGSDSHPPLL PDRPRDHVKK TIL



Seq ID NO: 90 DNA sequence 

Nucleic Acid Accession #: NM_002391. 

Coding sequence: 26-457 



CGGGCGAAGC 
CGCCCTGCTG 
CCCGGGGAGC 
CGGCGTGGGT 
GCCCTGCAAC 
TGCGTGTGAT 



11 
I 

AGCGCGGGCA 
GCGCTCACCT 
GAGTGCGCTG 
TTCCGCGAGG 
TGGAAGAAGG 
GGGGGCACAG 



21 
I 

GCGAGATGCA 
CCGCGGTCGC 
AGTGGGCCTG 
GCACCTGCGG 
AGTTTGGAGC 
GCACCAAAGT 



31 



41 



51 

I 



600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 



60 
120 
180 
240 
300 
360 
420 
480 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 



60 
120 
180 
240 



GCACCGAGGC TTCCTCCTCC TCACCCTCCT 
CAAAAAGAAA GATAAGGTGA AGAAGGGCGG 
GGGGCCCTGC ACCCCCAGCA GCAAGGATTG 
GGCCCAGACC CAGCGCATCC GGTGCAGGGT 
CGACTGCAAG TACAAGTTTG AGAACTGGGG 
CCGCCAAGGC ACCCTGAAGA AGGCGCGCTA 



60 
120 
180 
240 
300 
360 



CAATGCTCAG TGCCAGGAGA CCATCCGCGT CACCAAGCCC TGCACCCCCA AGACCAAAGC 420 
AAAGGCCAAA GCCAAGAAAG GGAAGGGAAA GGACTAGACG CCAAGCCTGG ATGCCAAGGA 480 
GCCCCTGGTG TCACATGGGG CCTGGCCACG CCCTCCCTCT CCCAGGCCCG AGATGTGACC 540 
CACCAGTGCC TTCTGTCTGC TCGTTAGCTT TAATCAATCA TGCCCTGCCT TGTCCCTCTC 600 
ACTCCCCAGC CCCACCCCTA AGTGCCCAAA GTGGGGAGGG ACAAGGGATT CTGGGAAGCT 660 
TGAGCCTCCC CCAAAGCAAT GTGAGTCCCA GAGCCCGCTT TTGTTCTTOC CCACAATTCC 720 
ATTACTAAGA AACACATCAA ATAAACTGAC TTTTTCCCCC CAATAAAAGC TCTTCTTTTT 780 
TAATAT 

Seq ID NO: 91 Protein sequence: 
Protein Accession #: NP_002382.1 

1 11 21 31 41 51 

I I I I I I 

MQHRGFLLLT LLALLALTSA VAKKKDKVKK GGPGSECAEW AWGPCTPSSK DCGVGFREGT 60 
OGAQTQRIRC RVPCNWKKEF GADCKYKFEN WGACDGGTGT KVRQGTLKXA RYNAQCQETI 120 
RVTKPCTPKT KAKAKAKKGK GKD 



Seq ID NO: 92 DNA sequence 

Nucleic Acid Accession #: NM_005130.1 

Coding sequence: 98-802 



1 11 21 31 41 51

I I I I I I

CTCTACCTGA CACAGCTGCA GCCTGCAATT CACTCCCACT GCCTGGGATT GCACTGGATC
CGTGTGCTCA GAACAAGGTG AACGCCCAGC TGCAGCCATG AAGATCTGTA GCCTCACCCT
GCTCTCCTTC CTCCTACTGG CTGCTCAGGT GCTCCTGGTG GAGGGGAAAA AAAAAGTGAA
GAATGGACTT CACAGCAAAG TGGTCTCAGA ACAAAAGGAC ACTCTGGGCA ACACCCAGAT
TAAGCAGAAA AGCAGGCCCG GGAACAAAGG CAAGTTTGTC ACCAAAGACC AAGCCAACTG
CAGATGGGCT GCTACTGAGC AGGAGGAGGG CATCTCTCTC AAGGTTGAGT GCACTCAATT
GGACCATGAA TTTTCCTGTG TCTTTGCTGG CAATCCAACC TCATGCCTAA AGCTCAAGGA
TGAGAGAGTC TATTGGAAAC AAGTTGCCCG GAATCTGCGC TCACAGAAAG ACATCTGTAG
ATATTCCAAG ACAGCTGTGA AAACCAGAGT GTGCAGAAAG GATTTTCCAG AATCCAGTCT
TAAGCTAGTC AGCTCCACTC TATTTGGGAA CACAAAGCCC AGGAAGGAGA AAACAGAGAT
GTCCCCCAGG GAGCACATCA AGGGCAAAGA GACCACCCCC TCTAGCCTAG CAGTGACCCA
GACCATGGCC ACCAAAGCTC CCGAGTGTGT GGAGGACCCA GATATGGCAA ACCAGAGGAA
GACTGCCCTG GAGTTCTGTG GAGAGACTTG GAGCTCTCTC TGCACATTCT TCCTCAGCAT
AGTGCAGGAC ACGTCATGCT AATGAGGTCA AAAGAGAACG GGTTCCTTTA AGAGATGTCA
TGTCGTAAGT CCCTCTGTAT ACTTTAAAGC TCTCTACAGT CCCCCCAAAA TATGAACTTT
TGTGCTTAGT GAGTGCAACG AAATATTTAA ACAAGTTTTG TATTTTTTGC TTTTGTGTTT
TGGAATTTGC CTTATTTTTC TTGGATGCGA TGTTCAGAGG CTGTTTCCTG CAGCATGTAT
TTCCATGGCC CACACAGCTA TGTGTTTGAG CAGCGAAGAG TCTTTGAGCT GAATGAGCCA
GAGTGATAAT TTCAGTGCAA CGAACTTTCT GCTGAATTAA TGGTAATAAA ACTCTGGGTG
TTTTTCAAAA AAAAAAAAAA AAA



Seq ID NO: 93 Protein sequence: 
Protein Accession #: NP_005121.1 

1 11 21 31 41 51 

I I I I I I 

MKICSLTLLS FLLLAAQVLL VEGKKKVKNG LHSKWSEQK DTLGNTQIKQ KSRPGNKGKF 
VTKDQANCRW AATEQEEGIS LKVECTQLDH EFSCVFAGNP TSCLKLKDER VYWKQVARNL 
RSQKDICRYS KTAVKTRVCR KDFPESSLKL VSSTLFGNTK PRKEKTEMSP REHIKGKETT 
PSSLAVTQTM ATKAPECVED PDMANQRKTA LEFCGETWSS LCTFFLSIVQ DTSC 

Seq ID NO: 94 DNA sequence 

Nucleic Acid Accession #: NM_012101 

Coding sequence: 125-1891 



1 11 21 31 41 51

I I I I I I

CTCCTCACAG GTGTGTCTCT AGTCCTCGTG GTTGCCTGCC CCACTCCCTG CCGAGACGCC
TGCCAGAAAG GTCACCTATC CTGAACCCCA GCAAGCCTGA AACAGCTCAG CCAAGCACCC
TGCGATGGAA GCTGCAGATG CCTCCAGGAG CAACGGGTCG AGCCCAGAAG CCAGGGATGC
CCGGAGCCCG TCGGGCCCCA GTGGCAGCCT GGAGAATGGC ACCAAGGCTG ACGGCAAGGA
TGCCAAGACC ACCAACGGGC ACGGCGGGGA GGCAGCTGAG GGCAAGAGCC TGGGCAGCGC
CCTGAAGCCA GGGGAAGGTA GGAGCGCCCT GTTCGCGGGC AATGAGTGGC GGCGACCCAT
CATCCAGTTT GTCGAGTCCG GGGACGACAA GAACTCCAAC TACTTCAGCA TGGACTCTAT
GGAAGGCAAG AGGTCGCCGT ACGCAGGGCT CCAGCTGGGG GCTGCCAAGA AGCCACCCGT
TACCTTTGCC GAAAAGGGCG ACGTGCGCAA GTCCATTTTC TCGGAGTCCC GGAAGCCCAC
GGTGTCCATC ATGGAGCCCG GGGAGACCCG GCGGAACAGC TACCCCCGGG CCGACACGGG
CCTTTTTTCA CGGTCCAAGT CCGGCTCCGA GGAGGTGCTG TGCGACTCCT GCATCGGCAA
CAAGCAGAAG GCGGTCAAGT CCTGCCTGGT GTGCCAGGCC TCCTTCTGCG AGCTGCATCT
CAAGCCCCAC CTGGAGGGCG CCGCCTTCCG AGACCACCAG CTGCTCGAGC CCATCCGGGA
CTTTGAGGCC CGCAAGTGTC CCGTGCATGG CAAGACGATG GAGCTCTTCT GCCAGACCGA
CCAGACCTGC ATCTGCTACC TTTGCATGTT CCAGGAGCAC AAGAATCATA GCACCGTGAC
AGTGGAGGAG GCCAAGGCCG AGAAGGAGAC GGAGCTGTCA CTGCAAAAGG AGCAGCTGCA
GCTCAAGATC ATTGAGATTG AGGATGAAGC TGAGAAGTGG CAGAAGGAGA AGGACCGCAT
CAAGAGCTTC ACCACCAATG AGAAGGCCAT CCTGGAGCAG AACTTCCGGG ACCTGGTGCG
GGACCTGGAG AAGCAAAAGG AGGAAGTGAG GGCTGCGCTG GAGCAGCGGG AGCAGGATGC
TGTGGACCAA GTGAAGGTGA TCATGGATGC TCTGGATGAG AGAGCCAAGG TGCTGCATGA
GGACAAGCAG ACCCGGGAGC AGCTGCATAG CATCAGCGAC TCTGTGTTGT TTCTGCAGGA
ATTTGGTGCA TTGATGAGCA ATTACTCTCT CCCCCCACCC CTGCCCACCT ATCATGTCCT
GCTGGAGGGG GAGGGCCTGG GACAGTCACT AGGCAACTTC AAGGACGACC TGCTCAATGT
ATGCATGCGC CACGTTGAGA AGATGTGCAA GGCGGACCTG AGCCGTAACT TCATTGAGAG
GAACCACATG GAGAACGGTG GTGACCATCG CTATGTGAAC AACTACACGA ACAGCTTCGG



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 



60 
120 
180 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
GGGTGAGTGG AGTGCACCGG ACACCATGAA GAGATACTCC ATGTACCTGA CACCCAAAGG 1560 

TGGGGTCCGG ACATCATACC AGCCCTCGTC TCCTGGCOGC TTCACCAAGG AGACCACCCA 1620 

GAAGAATTTC AACAATCTCT ATGGCACCAA AGGTAACTAC ACCTCCCGGG TCTGGGAGTA 1680 

CTCCTCCAGC ATTCAGAACT CTGACAATGA CCTGCCCGTC GTCCAAGGCA GCTCCTCCTT 1740 

CTCCCTGAAA GGCTATCCCT CCCTCATGCG GAGCCAAAGC CCCAAGGCCC AGCCCCAGAC 1800 

TTGGAAATCT GGCAAGCAGA CTATGCTGTC TCACTACCGG CCATTCTACG TCAACAAAGG 1860 

CAACGGGATT GGGTCCAACG AAGCCCCATG AGCTCCTGGC GGAAGGAACG AGGCGCCACA 1920 

CCCCTGCTCT TCCTCCTGAC CCTGCTGCTC TTGCCTTCTA AGCTACTGTG CTTGTCTGGG 1980 

TGGGAGGGAG CCTGGTCCTG CACCTGCCCT CTGCAGCCCT CTGCCAGCCT CTTGGGGGCA 2040 

GTTCCGGCCT CTCCGACTTC CCCACTGGCC ACACTCCATT CAGACTCCTT TCCTGCCTTG 2100 

TGACCTCAGA TGGTCACCAT CATTCCTGTG CTCAGAGGCC AACCCATCAC AGGGGTGAGA 2160 

TAGGTTGGGG CCTGCCCTAA CCCGCCAGCC TCCTCCTCTC GGGCTGGATC TGGGGGCTAG 2220 

CAGTGAGTAC CCGCATGGTA TCAGCCTGCC TCTCCCGCCC ACGCCCTGCT GTCTCCAGGC 2280 

CTATAGACGT TTCTCTCCAA GGCCCTATCC CCCAATGTTG TCAGCAGATG CCTGGACAGC 2340 

ACAGCCACCC ATCTCCCATT CACATGGCCC ACCTCCTGCT TCCCAGAGGA CTGGCCCTAC 2400 

GTGCTCTCTC TCGTCCTACC TATCAATGCC CAGCATGGCA GAACCTGCAG TGGCCAAGGG 2460 

CTGCAGATGG AAACCTCTCA GTGTCTTGAC ATCACCCTAC CCAGGCGGTG GGTCTCCACC 2520 

ACAGCCACTT TGAGTCTGTG GTCCCTGGAG GGTGGCTTCT CCTGACTGGC AGGATGACCT 2580 

TAGCCAAGAT ATTCCTCTGT TCCCTCTGCT GAGATAAAGA ATTCCCTTAA CATGATATAA 2640 

TCCACCCATG CAAATAGCTA CTGGCCCAGC TACCATTTAC CATTTGCCTA CAGAATTTCA 2700 

TTCAGTCTAC ACTTTGGCAT TCTCTCTGGC GATGGAGTGT GGCTGGGCTG ACCGCAAAAG 2760 

GTGCCTTACA CACTGCCCCC ACCCTCAGCC GTTGCCCCAT CAGAGGCTGC CTCCTCCTTC 2820 

TGATTACCCC CCATGTTGCA TATCAGGGTG CTCAAGGATT GGAGAGGAGA CAAAACCAGG 2880 

AGCAGCACAG TGGGGACATC TCCCGTCTCA ACAGCCCCAG GCCTATGGGG GCTCTGGAAG 2940 

GATGGGCCAG CTTGCAGGGG TTGGGGAGGG AGACATCCAG CTTGGGCTTT CCCCTTTGGA 3000 
ATAAACCATT GGTCTGTC 

Seq ID NO: 95 Protein sequence: 
Protein Accession #: NP_036233.1 



1 11 21 31 41 51 

MEAADASRSN GSSPEARDAR SPSGPSGSLE NGTKADGKDA KTTNGHGGEA AEGKSLGSAL 60 

KPGEGRSALF AGNEWRRPII QFVESGDDKN SNYFSMDSME GKRSPYAGLQ LGAAKKPPVT 120 

FAEKGDVRKS IFSESRKPTV SIMEPGETRR NSYPRADTGL FSRSKSGSEE VLCDSCIGNK 180 

QKAVKSCLVC QASFCELHLK PHLEGAAFRD HQLLEPIRDF EARKCPVHGK TMELFCQTDQ 240 

TCICYLCMFQ EHKNHSTVTV EEAKAEKETE LSLQKEQLQL KIIEIEDEAE KWQKEKDRIK 300 

SFTTNEKAIL EQNFRDLVRD LEKQKEEVRA ALEQREQDAV DQVKVIMDAL DERAKVLHED 360 

KQTREQLHSI SDSVLFLQEF GALMSNYSIiP PPLPTYHVLL EGEGLGQSLG NFKDDLLNVC 420 

MRHVEKMCKA DLSRNFIERN HMENGGDHRY VNNYTNSFGG EWSAPDTMKR YSMYLTPKGG 480 

VRTSYQPSSP GRFTKETTQK NFNNLYGTKG NYTSRVWEYS SSIQNSDNDL PWQGSSSFS 540 
LKGYPSLMRS QSPKAQPQTW KSGKQTMLSH YRPFYVNKGN GIGSNEAP 



Seq ID NO: 96 DNA sequence 

Nucleic Acid Accession #: NM_080668.1 

Coding sequence: 83-841 

1 11 21 31 41 51 

GGCACGAGGG CAGCGAGTGG CCTTCCCGGT TGGCGCGCGC CCGGGGCGGC GGCGCTGGAG 60 

GAGCTCGAGA CGGAGCCTAG TTATGTCTGG GAGGCGAACG CGGTCCGGAG GAGCCGCTCA 120 

GCGCTCCGGG CCAAGGGCCC CATCTCCTAC TAAGCCTCTG CGGAGGTCCC AGCGGAAATC 180 

AGGCTCTGAA CTCCCGAGCA TCCTCCCTGA AATCTGGCCG AAGACACCCA GTGCGGCTGC 240 

AGTCAGAAAG CCCATCGTCT TAAAGAGGAT CGTGGCCCAT GCTGTAGAGG TCCCAGCTGT 300 

CCAATCACCT CGCAGGAGCC CTAGGATTTC CTTTTTCTTG GAGAAAGAAA ACGAGCCCCC 360 

TGGCAGGGAG CTTACTAAGG AGGACCTTTT CAAGACACAC AGCGTCCCTG CCACCCCCAC 420 

CAGCACTCCT GTGCCGAACC CTGAGGCCGA GTCCAGCTCC AAGGAAGGAG AGCTGGACGC 480 

CAGAGACTTG GAAATGTCTA AGAAAGTCAG GCGTTCCTAC AGCCGGCTGG AGACCCTGGG 540 

CTCTGCCTCT ACCTCCACCC CAGGCCGCCG GTCCTGCTTT GGCTTCGAGG GGCTGCTGGG 600 

GGCAGAAGAC TTGTCCGGAG TCTCGCCAGT GGTGTGCTCC AAACTCACCG AGGTCCCCAG 660 

GGTTTGTGCA AAGCCCTGGG CCCCAGACAT GACTCTCCCT GGAATCTCCC CACCACCCGA 720 

GAAACAGAAA CGTAAGAAGA AGAAAATGCC AGAGATCTTG AAAACGGAGC TGGATGAGTG 780 

GGCTGCGGCC ATGAATGCCG AGTTTGAAGC TGCTGAGCAG TTTGATCTCC TGGTTGAATG 840 

AGATGCAGTG GGGGGTGCAC CTGGCCAGAC TCTCCCTCCT GTCCTGTACA TAGCCACCTC 900 

CCTGTGGAGA GGACACTTAG GGTCCCCTCC CCTGGTCTTG TTACCTGTGT GTGTGCTGGT 960 

GCTGCGCATG AGGACTGTCT GCCTTTGAGG GCTTGGGCAG CAGCGGCAGC CATCTTGGTT 1020 

TTAGGAAATG GGGCCGCCTG GCCCAGCCAC TCACTGGTGT CCTGTCTCTT GTCGTCCTGT 1080 ■ 

CCTTCCTATC TCCCCAAAGT ACCATAGCCA GTTTCCAGAT GGGCCACAGA CTGGGGAGGA 1140 

GAATCAGTGG CCCAGCCAGA AGTTAAAGGG CTGAGGGTTG AGGTGAGAGG CACCTCTGCT 1200 

CTTGTTGGGA GGGGTGGCTG CTTGGAAATA GGCCCAGGGG CTCTGCCAGC CTCGGCCTCT 1260 

CCCTCCTGAG TTGCCTTCTG TTGGTGGCTT TCTTCTTGAA CCCACCTGTG TAAAGAGGTT 1320 

TTCAGTTCCG TGGGTTTCCC CTTTGATTCT GTAAATAGTC CCAGAGAGAA TTCGTGGGCT 1380 

GAGGGCAATT CTGTCTTGGA GGAAGAAGCT GGACATTCAG CCTGTGGAGT CTGAGTTTTG 1440 

AAGGATGTAG GGAGCCTTAG TTGGGTCTCA GACCATAAGT GTGTACTACA CAGAAGCTGT 1500 

GTTTTCTAGT TCTGGTCTGC TGTTGAGATG TTTGGTAAAT GCCAGGTTGA TAGGGCGCTG 1560 

GCTGCTTGGA GCAAAGGGTG CATTTCAGGG TGTGGCCACC AGGTGCTGTG AGTTTCTGTG 1620 

GCTCATGGCC TCTGGGCTGG TCCCTTGCAC AGGGCCCACG CTGGAGTCTT ACCACTCTGC 1680 

TGCAGGGGTG GAAGGTGGCC CCTCTTGTCA CCCATACCCA TTTCTTACAA AATAAGTTAC 1740 

ACCGAGTCTA CTTGGCCCTA GAAGAGAAAG TTGAAGAGTC CCAGACCTAC TAGCATTTTG 1800 

CAACTATGCT TGTAAAGTCC TCGGAAAGTT TCCTCGCGTA CCAGACAGCG GCGGGGGCTG 1860 

ATAGCAATTT TAGTTTTTGG CCTCCCTATC CTCTCACATG AGAACACTGC CTGGATGCAT 1920 

CTCATGATCT CTGGAGAATT TCCCCATCTT TCTCTTCTTT CCATCGTGTG GATTCAATAG 1980 

TTTGGATTTG AAGGCTGCCC TGCCCCCGAC TCTCCTGCCG CACCCCTGGC CATTGTACCT 2040 

TTTGATGTTT AGAAGTTCGT GGAAGTAGAC GCTGAGGTGT GCAGAGGAGC TGGTGGATAA 2100 

CAGAGAATGC CAGGGAAGAT GAGTGCTGGG TCAGGGTACT TGGATGAAAC GGTGCAGGCC 2160 

AGGCGGGCCC TAATAAAACC CTCTGCCAGG TCTGGGAGTC CCAGGCCATC TGCTCAACGC 2220 
TCTGTGGTTT GTCAGACCTG CAAGCAAGCC CCCTGCTGGG GAAGCCTAGG TGTCCTTGAG 2280 

CTGAACCGCA CTGAAGAACT CTTGTCCTCA CTGGCTGATG CAGCAGAACT CTTGGGAAAT 2340 

GTCTTAGTCC TGCAGAATCA GGAGTCACCA GATGATGCAG AGTTGAGATC ATCATTGCAA 2400 

AGTTCTCTGT TCCTGAGGAA CTAAATTTAA GGAAAAAATG GGATTTTGTT TTAGAGTTGG 2460 
AAAAAAAGCC TGATTAAAGA GTTTCTGCCT GTTAAAAAAA AAAAAAAAAA AAAAAA 

Seq ID NO: 97 Protein sequence: 
Protein Accession #: NP_542399.1 

1 11 21 31 41 51 

I I I I I I 

MSGHRTRSGG AAQRSGPRAP SPTKPLRRSQ RKSGSELPSI LPEIWPKTPS AAAVRKPIVL 60 

KRIVAHAVEV PAVQSPRRSP RISFFLEKEN EPPGRELTKE DLFKTHSVPA TPTSTPVPNP 120 

EAESSSKEGE LDARDLEMSK KVRRSYSRLE TLGSASTSTP GRRSCFGPEG LLGAEDLSGV 180 

SPWCSKLTE VPRVCAKPWA PDMTLPGISP PPEKQKRKKK KMPEILKTEX DEWAAAMNAE 240 
FEAAEQFDLL VE 

Seq ID NO: 98 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 58-12444 

1 11 21 31 41 51 

I I I I I I 

GGGGCATTTC CGGGTCCGGG CCGAGCGGGC GCACGCGCGG GAGCGGGACT CGGCGGCATG 60 

GCGGGCTCCG GAGCCGGTGT GCGTTGCTCC CTGCTGCGGC TGCAGGAGAC CTTGTCCGCT 120 

GCGGACCGCT GCGGTGCTGC CCTGGCCGGT CATCAACTGA TCCGCGGCCT GGGGCAGGAA 180 

TGCGTCCTGA GCAGCAGCCC CGCGGTGCTG GCATTACAGA CATCTTTAGT TTTTTCCAGA 240 

GATTTOGGTT TGCTTGTATT TGTCCGGAAG TCACTCAACA GTATTGAATT TCGTGAATGT 300 

AGAGAAGAAA TCCTAAAGTT TTTATGTATT TTCTTAGAAA AAATGGGCCA GAAGATCGCA 360 

CCTTACTCTG TTGAAATTAA GAACACTTGT ACCAGTGTTT ATACAAAAGA TAGAGCTGCT 420 

AAATGTAAAA TTCCAGCCCT GGACCTTCTT ATTAAGTTAC TTCAGACTTT TAGAAGTTCT 480 

AGACTCATGG ATGAATTTAA AATTGGAGAA TTATTTAGTA AATTCTATGG AGAACTTGCA 540 

TTGAAAAAAA AAATACCAGA TACAGTTTTA GAAAAAGTAT ATGAGCTCCT AGGATTATTG 600 

GGTGAAGTTC ATCCTAGTGA GATGATAAAT AATGCAGAAA ACCTGTTCCG CGCTTTTCTG 660 

GGTGAACTTA AGACCCAGAT GACATCAGCA GTAAGAGAGC CCAAACTACC TGTTCTGGCA 720 

GGATGTCTGA AGGGGTTGTC CTCACTTCTG TGCAACTTCA CTAAGTCCAT GGAAGAAGAT 780 

CCCCAGACTT CAAGGGAGAT TTTTAATTTT GTACTAAAGG CAATTCGTCC TCAGATTGAT 840 

CTGAAGAGAT ATGCTGTGCC CTCAGCTGGC TTGCGCCTAT TTGCCCTGCA TGCATCTCAG 900 

TTTAGCACCT GCCTTCTGGA CAACTACGTG TCTCTATTTG AAGTCTTGTT AAAGTGGTGT 960 

GCCCACACAA ATGTAGAATT GAAAAAAGCT GCACTTTCAG CCCTGGAATC CTTTCTGAAA 1020 

CAGGTTTCTA ATATGGTGGC GAAAAATGCA GAAATGCATA AAAATAAACT GCAGTACTTT 1080 

ATGGAGCAGT TTTATGGAAT CATCAGAAAT GTGGATTCGA ACAACAAGGA GTTATCTATT 1140 

GCTATCCGTG GATATGGACT TTTTGCAGGA CCGTGCAAGG TTATAAACGC AAAAGATGTT 1200 

GACTTCATGT ACGTTGAGCT CATTCAGCGC TGCAAGCAGA TGTTCCTCAC CCAGACAGAC 1260 

ACTGGTGACG ACCGTGTTTA TCAGATGCCA AGCTTCCTCC AGTCTGTTGC AAGCGTCTTG 1320 

CTGTACCTTG ACACAGTTCC TGAGGTGTAT ACTCCAGTTC TGGAGCACCT CGTGGTGATG 1380 

CAGATAGACA GTTTCCCACA GTACAGTCCA AAAATGCAGC TGGTGTGTTG CAGAGCCATA 1440 

GTGAAGGTGT TCCTAGCTTT GGCAGCAAAA GGGCCAGTTC TCAGGAATTG CATTAGTACT 1500 

GTGGTGCATC AGGGTTTAAT CAGAATATGT TCTAAACCAG TGGTCCTTCC AAAGGGCCCT 1560 

GAGTCTGAAT CTGAAGACCA CCGTGCTTCA GGGGAAGTCA GAACTGGCAA ATGGAAGGTG 1620 

CCCACATACA AAGACTACGT GGATCTCTTC AGACATCTCC TGAGCTCTGA CCAGATGATG 1680 

GATTCTATTT TAGCAGATGA AGCATTTTTC TCTGTGAATT CCTCCAGTGA AAGTCTGAAT 1740 

CATTTACTTT ATGATGAATT TGTAAAATCC GTTTTGAAGA TTGTTGAGAA ATTGGATCTT 1800 

ACACTTGAAA TACAGACTGT TGGGGAACAA GAGAATGGAG ATGAGGCGCC TGGTGTTTGG 1860 

ATGATCCCAA CTTCAGATCC AGCGGCTAAC TTGCATCCAG CTAAACCTAA AGATTTTTCG 1920 

GCTTTCATTA ACCTGGTGGA ATTTTGCAGA GAGATTCTCC CTGAGAAACA AGCAGAATTT 1980 

TTTGAACCAT GGGTGTACTC ATTTTCATAT GAATTAATTT TGCAATCTAC AAGGTTGCCC 2040 

CTCATCAGTG GTTTCTACAA ATTGCTTTCT ATTACAGTAA GAAATGCCAA GAAAATAAAA 2100 

TATTTCGAGG GAGTTAGTCC AAAGAGTCTG AAACACTCTC CTGAAGACCC AGAAAAGTAT 2160 

TCTTGCTTTG CTTTATTTGT GAAATTTGGC AAAGAGGTGG CAGTTAAAAT GAAGCAGTAC 2220 

AAAGATGAAC TTTTGGCCTC TTGTTTGACC TTTCTTCTGT CCTTGCCACA CAACATCATT 2280 

GAACTCGATG TTAGAGCCTA CGTTCCTGCA CTGCAGATGG CTTTCAAACT GGGCCTGAGC 2340 

TATACCCCCT TGGCAGAAGT AGGCCTGAAT GCTCTAGAAG AATGGTCAAT TTATATTGAC 2400 

AGACATGTAA TGCAGCCTTA TTACAAAGAC ATTCTCCCCT GCCTGGATGG ATACCTGAAG 2460 

ACTTCAGCCT TGTCAGATGA GACCAAGAAT AACTGGGAAG TGTCAGCTCT TTCTCGGGCT 2520 

GCCCAGAAAG GATTTAATAA AGTGGTGTTA AAGCATCTGA AGAAGACAAA GAACCTTTCA 2580 

TCAAACGAAG CAATATCCTT AGAAGAAATA AGAATTAGAG TAGTACAAAT GCTTGGATCT 2640 

CTAGGAGGAC AAATAAACAA AAATCTTCTG ACAGTCACGT CCTCAGATGA GATGATGAAG 2700 

AGCTATGTGG CCTGGGACAG AGAGAAGCGG CTGAGCTTTG CAGTGCCCTT TAGAGAGATG 2760 

AAACCTGTCA TTTTCCTGGA TGTGTTCCTG CCTCGAGTCA CAGAATTAGC GCTCACAGCC 2820 

AGTGACAGAC AAACTAAAGT TGCAGCCTGT GAACTTTTAC ATAGCATGGT TATGTTTATG 2880 

TTGGGCAAAG CCACGCAGAT GCCAGAAGGG GGACAGGGAG CCCCACCCAT GTACCAGCTC 2940 

TATAAGCGGA CGTTTCCTGT GCTGCTTCGA CTTGCGTGTG ATGTTGATCA GGTGACAAGG 3000 

CAACTGTATG AGCCACTAGT TATGCAGCTG ATTCACTGGT TCACTAACAA CAAGAAATTT 3060 

GAAAGTCAGG ATACTGTTGC CTTACTAGAA GCTATATTGG ATGGAATTGT GGACCCTGTT 3120 

GACAGTACTT TAAGAGATTT TTGTGGTCGG TGTATTCGAG AATTCCTTAA ATGGTCCATT 3180 

AAGCAAATAA CACCACAGCA GCAGGAGAAG AGTCCAGTAA ACACCAAATC GCTTTTCAAG 3240 

CGACTTTATA GCCTTGCGCT TCACCCCAAT GCTTTCAAGA GGCTGGGAGC ATCACTTGCC 3300 

TTTAATAATA TCTACAGGGA ATTCAGGGAA GAAGAGTCTC TGGTGGAACA GTTTGTGTTT 3360 

GAAGCCTTGG TGATATACAT GGAGAGTCTG GCCTTAGCAC ATGCAGATGA GAAGTCCTTA 3420 

GGTACAATTC AACAGTGTTG TGATGCCATT GATCACCTAT GCCGCATCAT TGAAAAGAAG 3480 

CATGTTTCTT TAAATAAAGC AAAGAAACGA CGTTTGCCGC GAGGATTTCC ACCTTCCGCA 3540 

TCATTGTGTT TATTGGATCT GGTCAAGTGG CTTTTAGCTC ATTGTGGGAG GCCCCAGACA 3600 

GAATGTCGAC ACAAATCCAT TGAACTCTTT TATAAATTCG TTCCTTTATT GCCAGGCAAC 3660 

AGATCCCCTA ATTTGTGGCT GAAAGATGTT CTCAAGGAAG AAGGTGTCTC TTTTCTCATC 3720 

AACACCTTTG AGGGGGGTGG CTGTGGCCAG CCCTCGGGCA TCCTGGCCCA GCCCACCCTC 3780 

TTGTACCTTC GGGGGCCATT CAGCCTGCAG GCCACGCTAT GCTGGCTGGA CCTGCTCCTG 3840 
GCCGCGTTGG AGTGCTACAA CACGTTCATT GGCGAGAGAA CTGTAGGAGC GCTCCAGGTC 3900 

CTAGGTACTG AAGCCCAGTC TTCACTTTTG AAAGCAGTGG CTTTCTTCTT AGAAAGCATT 3960 

GCCATGCATG ACATTATAGC AGCAGAAAAG TGCTTTGGCA CTGGGGCAGC AGGTAACAGA 4020 

ACAAGCCCAC AAGAGGGAGA AAGGTACAAC TACAGCAAAT GCACCGTTGT GGTCCGGATT 4080 

ATGGAGTTTA CCACGACTCT GCTAAACACC TCCCCGGAAG GATGGAAGCT CCTGAAGAAG 4140 

GACTTGTGTA ATACACACCT GATGAGAGTC CTGGTGCAGA CGCTGTGTGA GCCCGCAAGC 4200 

ATAGGTTTCA ACATCGGAGA CGTCCAGGTT ATGGCTCATC TTCCTGATGT TTGTGTGAAT 4260 

CTGATGAAAG CTCTAAAGAT GTCCCCATAC AAAGATATCC TAGAGACCCA TCTGAGAGAG 4320 

AAAATAACAG CACAGAGCAT TGAGGAGCTT TGTGCCGTCA ACTTGTATGG CCCTGACGCG 4380 

CAAGTGGACA GGAGCAGGCT GGCTGCTGTT GTGTCTGCCT GTAAACAGCT TCACAGAGCT 4440 

GGGCTTCTGC ATAATATATT ACCGTCTCAG TCCACAGATT TGCATCATTC TGTTGGCACA 4500 

GAACTTCTTT CCCTGGTTTA TAAAGGCATT GCCCCTGGAG ATGAGAGACA GTGTCTGCCT 4560 

TCTCTAGACC TCAGTTGTAA GCAGCTGGCC AGCGGACTTC TGGAGTTAGC CTTTGCTTTT 4620 

GGAGGACTGT GTGAGCGCCT TGTGAGTCTT CTCCTGAACC CAGCGGTGCT GTCCACGGCG 4680 

TCCTTGGGCA GCTCACAGGG CAGCGTCATC CACTTCTCCC ATGGGGAGTA TTTCTATAGC 4740 

TTGTTCTCAG AAACGATCAA CACGGAATTA TTGAAAAATC TGGATCTTGC TGTATTGGAG 4800 

CTCATGCAGT CTTCAGTGGA TAATACCAAA ATGGTGAGTG CCGTTTTGAA CGGCATGTTA 4860 

GACCAGAGCT TCAGGGAGCG AGCAAACCAG AAACACCAAG GACTGAAACT TGCGACTACA 4920 

ATTCTGCAAC ACTGGAAGAA GTGTGATTCA TGGTGGGCCA AAGATTCCCC TCTCGAAACT 4980 

AAAATGGCAG TGCTGGCCTT ACTGGCAAAA ATTTTACAGA TTGATTCATC TGTATCTTTT 5040 

AATACAAGTC ATGGTTCATT CCCTGAAGTC TTTACAACAT ATATTAGTCT ACTTGCTGAC 5100 

ACAAAGCTGG ATCTACATTT AAAGGGCCAA GCTGTCACTC TTCTTCCATT CTTCACCAGC 5160 

CTCACTGGAG GCAGTCTGGA GGAACTTAGA CGTGTTCTGG AGCAGCTCAT CGTTGCTCAC 5220 

TTCCCCATGC AGTCCAGGGA ATTTCCTCCA GGAACTCCGC GGTTCAATAA TTATGTGGAC 5280 

TGCATGAAAA AGTTTCTAGA TGCATTGGAA TTATCTCAAA GCCCTATGTT GTTGGAATTG 5340 

ATGACAGAAG TTCTTTGTCG GGAACAGCAG CATGTCATGG AAGAATTATT TCAATCCAGT 5400 

TTCAGGAGGA TTGCCAGAAG GGGTTCATGT GTCACACAAG TAGGCCTTCT GGAAAGCGTG 5460 

TATGAAATGT TCAGGAAGGA TGACCCCCGC CTAAGTTTCA CACGCCAGTC CTTTGTGGAC 5520 

CGCTCCCTCC TCACTCTGCT GTGGCACTGT AGCCTGGATG CTTTGAGAGA ATTCTTCAGC 5580 

ACAATTGTGG TGGATGCCAT TGATGTGTTG AAGTCCAGGT TTACAAAGCT AAATGAATCT 5640 

ACCTTTGATA CTCAAATCAC CAAGAAGATG GGCTACTATA AGATTCTAGA CGTGATGTAT 5700 

TCTCGCCTTC CCAAAGATGA TGTTCATGCT AAGGAATCAA AAATTAATCA AGTTTTCCAT 5760 

GGCTCGTGTA TTACAGAAGG AAATGAACTT ACAAAGACAT TGATTAAATT GTGCTACGAT 5820 

GCATTTACAG AGAACATGGC AGGAGAGAAT CAGCTGCTGG AGAGGAGAAG ACTTTACCAT 5880 

TGTGCAGCAT ACAACTGCGC CATATCTGTC ATCTGCTGTG TCTTCAATGA GTTAAAATTT 5940 

TACCAAGGTT TTCTGTTTAG TGAAAAACCA GAAAAGAACT TGCTTATTTT TGAAAATCTG 6000 

ATCGACCTGA AGCGCCGCTA TAATTTTCCT GTAGAAGTTG AGGTTCCTAT GGAAAGAAAG 6060 

AAAAAGTACA TTGAAATTAG GAAAGAAGCC AGAGAAGCAG CAAATGGGGA TTCAGATGGT 6120 

CCTTCCTATA TGTCTTCCCT GTCATATTTG GCAGACAGTA CCCTGAGTGA GGAAATGAGT 6180 

CAATTTGATT TCTCAACCGG AGTTCAGAGC TATTCATACA GCTCCCAAGA CCCTAGACCT 6240 

GCCACTGGTC GTTTTCGGAG ACGGGAGCAG CGGGACCCCA CGGTGCATGA TGATGTGCTG 6300 

GAGCTGGAGA TGGACGAGCT CAATCGGCAT GAGTGCATGG CGCCCCTGAC GGCCCTGGTC 6360 

AAGCACATGC ACAGAAGCCT GGGCCCGCCT CAAGGAGAAG AGGATTCAGT GCCAAGAGAT 6420 

CTTCCTTCTT GGATGAAATT CCTCCATGGC AAACTGGGAA ATCCAATAGT ACCATTAAAT 6480 

ATCCGTCTCT TCTTAGCCAA GCTTGTTATT AATACAGAAG AGGTCTTTCG CCCTTACGCG 6540 

AAGCACTGGC TTAGCCCCTT GCTGCAGCTG GCTGCTTCTG AAAACAATGG AGGAGAAGGA 6600 

ATTCACTACA TGGTGGTTGA GATAGTGGCC ACTATTCTTT CATGGACAGG CTTGGCCACT 6660 

CCAACAGGGG TCCCTAAAGA TGAAGTGTTA GCAAATCGAT TGCTTAATTT CCTAATGAAA 6720 

CATGTCTTTC ATCCAAAAAG AGCTGTGTTT AGACACAACC TTGAAATTAT AAAGACCCTT 6780 

GTCGAGTGCT GGAAGGATTG TTTATCCATC CCTTATAGGT TAATATTTGA AAAGTTTTCC 6840 

GGTAAAGATC CTAATTCTAA AGACAACTCA GTAGGGATTC AATTGCTAGG CATCGTGATG 6900 

GCCAATGACC TGCCTCCCTA TGACCCACAG TGTGGCATCC AGAGTAGCGA ATACTTCCAG 6960 

GCTTTGGTGA ATAATATGTC CTTTGTAAGA TATAAAGAAG TGTATGCCGC TGCAGCAGAA 7020 

GTTCTAGGAC TTATACTTCG ATATGTTATG GAGAGAAAAA ACATACTGGA GGAGTCTCTG 7080 

TGTGAACTGG TTGCGAAACA ATTGAAGCAA CATCAGAATA CTATGGAGGA CAAGTTTATT 7140 

GTGTGCTTGA ACAAAGTGAC CAAGAGCTTC CCTCCTCTTG CAGACAGGTT CATGAATGCT 7200 

GTGTTCTTTC TGCTGCCAAA ATTTCATGGA GTGTTGAAAA CACTCTGTCT GGAGGTGGTA 7260 

CTTTGTCGTG TGGAGGGAAT GACAGAGCTG TACTTCCAGT TAAAGAGCAA GGACTTCGTT 7320 

CAAGTCATGA GACATAGAGA TGATGAAAGA CAAAAAGTAT GTTTGGACAT AATTTATAAG 7380 

ATGATGCCAA AGTTAAAACC AGTAGAACTC CGAGAACTTC TGAACCCCGT TGTGGAATTC 7440 

GTTTCCCATC CTTCTACAAC ATGTAGGGAA CAAATGTATA ATATTCTCAT GTGGATTCAT 7500 

GATAATTACA GAGATCCAGA AAGTGAGACA GATAATGACT CCCAGGAAAT ATTTAAGTTG 7560 

GCAAAAGATG TGCTGATTCA AGGATTGATC GATGAGAACC CTGGACTTCA ATTAATTATT 7620 

CGAAATTTCT GGAGCCATGA AACTAGGTTA CCTTCAAATA CCTTGGACCG GTTGCTGGCA 7680 

CTAAATTCCT TATATTCTCC TAAGATAGAA GTGCACTTTT TAAGTTTAGC AACAAATTTT 7740 

CTGCTCGAAA TGACCAGCAT GAGCCCAGAT TATCCAAACC CCATGTTCGA GCATCCTCTG 7800 

TCAGAATGCG AATTTCAGGA ATATACCATT GATTCTGATT GGCGTTTCCG AAGTACTGTT 7860 

CTCACTCCGA TGTTTGTGGA GACCCAGGCC TCCCAGGGCA CTCTCCAGAC CCGTACCCAG 7920 

GAAGGGTCCC TCTCAGCTCG CTGGCCAGTG GCAGGGCAGA TAAGGGCCAC CCAGCAGCAG 7980 

CATGACTTCA CACTGACACA GACTGCAGAT GGAAGAAGCT CATTTGATTG GCTGACCGGG 8040 

AGCAGCACTG ACCCGCTGGT CGACCACACC AGTCCCTCAT CTGACTCCTT GCTGTTTGCC 8100 

CACAAGAGGA GTGAAAGGTT ACAGAGAGCA CCCTTGAAGT CAGTGGGGCC TGATTTTGGG 8160 

AAAAAAAGGC TGGGCCTTCC AGGGGACGAG GTGGATAACA AAGTGAAAGG TGCGGCCGGC 8220 

CGGACGGACC TACTACGACT GCGCAGACGG TTTATGAGGG ACCAGGAGAA GCTCAGTTTG 8280 

ATGTATGCCA GAAAAGGCGT TGCTGAGCAA AAACGAGAGA AGGAAATCAA GAGTGAGTTA 8340 

AAAATGAAGC AGGATGCCCA GGTCGTTCTG TACAGAAGCT ACCGGCACGG AGACCTTCCT 8400 

GACATTCAGA TCAAGCAGAG CAGCCTCATC ACCCCGTTAC AGGCCGTGGC CCAGAGGGAC 8460 

CCAATAATTG CAAAACAGCT CTTTAGCAGC TTGTTTTCTG GAATTTTGAA AGAGATGGAT 8520 

AAATTTAAGA CACTGTCTGA AAAAAACAAC ATCACTCAAA AGTTGCTTCA AGACTTCAAT 8580 

CGTTTTCTTA ATACCACCTT CTCTTTCTTT CCACCCTTTG TCTCTTGTAT TCAGGACATT 8640 

AGCTGTCAGC ACGCAGCCCT GCTGAGCCTC GACCCAGCGG CTGTTAGCGC TGGTTGCCTG 8700 

GCCAGCCTAC AGCAGCCCGT GGGCATCCGC CTGCTAGAGG AGGCTCTGCT CCGCCTGCTG 8760 

CCTGCTGAGC TGCCTGCCAA GCGAGTCCGT GGGAAGGCCC GCCTCCCTCC TGATGTCCTC 8820 

AGATGGGTGG AGCTTGCTAA GCTGTATAGA TCAATTGGAG AATACGACGT CCTCCGTGGG 8880 

ATTTTTACCA GTGAGATAGG AACAAAGCAA ATCACTCAGA GTGCATTATT AGCAGAAGCC 8940 

AGAAGTGATT ATTCTGAAGC TGCTAAGCAG TATGATGAGG CTCTCAATAA ACAAGACTGG 9000 

GTAGATGGTG AGCCCACAGA AGCCGAGAAG GATTTTTGGG AACTTGCATC CCTTGACTGT 9060 
TACAACCACC TTGCTGAGTG GAAATCACTT GAATACTGTT CTACAGCCAG TATAGACAGT 9120 

GAGAACCCCC CAGACCTAAA TAAAATCTGG AGTGAACCAT TTTATCAGGA AACATATCTA 9180 

CCTTACATGA TCCGCAGCAA GCTGAAGCTG CTGCTCCAGG GAGAGGCTGA CCAGTCCCTG 9240 

CTGACATTTA TTGACAAAGC TATGCACGGG GAGCTCCAGA AGGCGATTCT AGAGCTTCAT 9300 
TACAGTCAAG AGCTGAGTCT GCTTTACCTC CTGCAAGATG ATGTTGACAG AGCCAAATAT 9360 

TACATTCAAA ATGGCATTCA GAGTTTTATG CAGAATTATT CTAGTATTGA TGTCCTCTTA 9420 

CACCAAAGTA GACTCACCAA ATTGCAGTCT GTACAGGCTT TAACAGAAAT TCAGGAGTTC 9480 

ATCAGCTTTA TAAGCAAACA AGGCAATTTA TCATCTCAAG TTCCCCTTAA GAGACTTCTG 9540 

AACACCTGGA CAAACAGATA TCCAGATGCT AAAATGGACC CAATGAACAT CTGGGATGAC 9600 

ATCATCACAA ATCGATGTTT CTTTCTCAGC AAAATAGAGG AGAAGCTTAC CCCTCTTCCA 9660 

GAAGATAATA GTATGAATGT GGATCAAGAT GGAGACCCCA GTGACAGGAT GGAAGTGCAA 9720 

GAGCAGGAAG AAGATATCAG CTCCCTGATC AGGAGTTGCA AGTTTTCCAT GAAAATGAAG 9780 

ATGATAGACA GTGCCCGGAA GCAGAACAAT TTCTCACTTG CTATGAAACT ACTGAAGGAG 9840 

CTGCATAAAG AGTCAAAAAC CAGAGACGAT TGGCTGGTGA GCTGGGTGCA GAGCTACTGC 9900 
CGCCTGAGCC ACTGCCGGAG CCGGTCCCAG GGCTGCTCTG AGCAGGTGCT CACTGTGCTG 9960 

AAAACAGTCT CTTTGTTGGA TGAGAACAAC GTGTCAAGCT ACTTAAGCAA AAATATTCTG 10020 

GCTTTCCGTG ACCAGAACAT TCTCTTGGGT ACAACTTACA GGATCATAGC GAATGCTCTC 10080 

AGCAGTGAGC CAGCCTGCCT TGCTGAAATC GAGGAGGACA AGGCTAGAAG AATCTTAGAG 10140 

CTTTCTGGAT CCAGTTCAGA GGATTCAGAG AAGGTGATCG CGGGTCTGTA CCAGAGAGCA 10200 
TTCCAGCACC TCTCTGAGGC TGTGCAGGCG GCTGAGGAGG AGGCCCAGCC TCCCTCCTGG 10260 

AGCTGTGGGC CTGCAGCTGG GGTGATTGAT GCTTACATGA CGCTGGCAGA TTTCTGTGAC 10320 

CAACAGCTGC GCAAGGAGGA AGAGAATGCA TCAGTTATTG ATTCTGCAGA ACTGCAGGCG 10380 

TATCCAGCAC TTGTGGTGGA GAAAATGTTG AAAGCTTTAA AATTAAATTC CAATGAAGCC 10440 

AGATTGAAGT TTCCTAGATT ACTTCAGATT ATAGAACGGT ATCCAGAGGA GACTTTGAGC 10500 
CTCATGACAA AAGAGATCTC TTCCGTTCCC TGCTGGCAGT TCATCAGCTG GATCAGCCAC 10560 

ATGGTGGCCT TACTGGACAA AGACCAAGCC GTTGCTGTTC AGCACTCTGT GGAAGAAATC 10620 

ACTGATAACT ACCCGCAGGC TATTGTTTAT CCCTTCATCA TAAGCAGCGA AAGCTATTCC 10680 

TTCAAGGATA CTTCTACTGG TCATAAGAAT AAGGAGTTTG TGGCAAGGAT TAAAAGTAAG 10740 

TTGGATCAAG GAGGAGTGAT TCAAGATTTT ATTAATGCCT TAGATCAGCT CTCTAATCCT 10800 
GAACTGCTCT TTAAGGATTG GAGCAATGAT GTAAGAGCTG AACTAGCAAA AACCCCTGTA 10860 

AATAAAAAAA ACATTGAAAA AATGTATGAA AGAATGTATG CAGCCTTGGG TGACCCAAAG 10920 

GCTCCAGGCC TGGGGGCCTT TAGAAGGAAG TTTATTCAGA CTTTTGGAAA AGAATTTGAT 10980 

AAACATTTTG GGAAAGGAGG TTCTAAACTA CTGAGAATGA AGCTCAGTGA CTTCAACGAC 11040 

ATTACCAACA TGCTACTTTT AAAAATGAAC AAAGACTCAA AGCCCCCTGG GAATCTGAAA 11100 
GAATGTTCAC CCTGGATGAG CGACTTCAAA GTGGAGTTCC TGAGAAATGA GCTGGAGATT 11160 

CCCGGTCAGT ATGACGGTAG GGGAAAGCCA TTGCCAGAGT ACCACGTGCG AATCGCCGGG 11220 

TTTGATGAGC GGGTGACAGT CATGGCGTCT CTGCGAAGGC CCAAGCGCAT CATCATCCGT 11280 

GGCCATGACG AGAGGGAACA CCCTTTCCTG GTGAAGGGTG GCGAGGACCT GCGGCAGGAC 11340 

CAGCGCGTGG AGCAGCTCTT CCAGGTCATG AATGGGATCC TGGCCCAAGA CTCCGCCTGC 11400 
AGCCAGAGGG CCCTGCAGCT GAGGACCTAT AGCGTTGTGC CCATGACCTC CAGGTTAGGA 11460 

TTAATTGAGT GGCTTGAAAA TACTGTTACC TTGAAGGACC TTCTTTTGAA CACCATGTCC 11520 

CAAGAGGAGA AGGCGGCTTA CCTGAGTGAT CCCAGGGCAC CGCCGTGTGA ATATAAAGAT 11580 

TGGCTGACAA AAATGTCAGG AAAACATGAT GTTGGAGCTT ACATGCTAAT GTATAAGGGC 11640 

GCTAATCGTA CTGAAACAGT CACGTCTTTT AGAAAACGAG AAAGTAAAGT GCCTGCTGAT 11700 
CTCTTAAAGC GGGCCTTCGT GAGGATGAGT ACAAGCCCTG AGGCTTTCCT GGCGCTCCGC 11760 

TCCCACTTCG CCAGCTCTCA CGCTCTGATA TGCATCAGCC ACTGGATCCT CGGGATTGGA 11820 

GACAGACATC TGAACAACTT TATGGTGGCC ATGGAGACTG GCGGCGTGAT CGGGATCGAC 11880 

TTTGGGCATG CGTTTGGATC CGCTACACAG TTTCTGCCAG TCCCTGAGTT GATGCCTTTT 11940 

CGGCTAACTC GCCAGTTTAT CAATCTGATG TTACCAATGA AAGAAACGGG CCTTATGTAC 12000 
AGCATCATGG TACACGCACT CCGGGCCTTC CGCTCAGACC CTGGCCTGCT CACCAACACC 12060 

ATGGATGTGT TTGTCAAGGA GCCCTCCTTT GATTGGAAAA ATTTTGAACA GAAAATGCTG 12120 

AAAAAAGGAG GGTCATGGAT TCAAGAAATA AATGTTGCTG AAAAAAATTG GTACCCCCGA 12180 

CAGAAAATAT GTTACGCTAA GAGAAAGTTA GCAGGTGCCA ATCCAGCAGT CATTACTTGT 12240 

GATGAGCTAC TCCTGGGTCA TGAGAAGGCC CCTGCCTTCA GAGACTATGT GGCTGTGGCA 12300 
CGAGGAAGCA AAGATCACAA CATTCGTGCC CAAGAACCAG AGAGTGGGCT TTCAGAAGAG 12360 

ACTCAAGTGA AGTGCCTGAT GGACCAGGCA ACAGACCCCA ACATCCTTGG CAGAACCTGG 12420 

GAAGGATGGG AGCCCTGGAT GTGAGGTCTG TGGGAGTCTG CAGATAGAAA GCATTACATT 12480 

GTTTAAAGAA TCTACTATAC TTTGGTTGGC AGCATTCCAT GAGCTGATTT TCCTGAAACA 12540 

CTAAAGAGAA ATGTCTTTTG TGCTACAGTT TCGTAGCATG AGTTTAAATC AAGATTATGA 12600 
TGAGTAAATG TGTATGGGTT AAATCAAAGA TAAGGTTATA GTAACATCAA AGATTAGGTG 12660 

AGGTTTATAG AAAGATAGAT ATCCAGGCTT ACCAAAGTAT TAAGTCAAGA ATATAATATG 12720 

TGATCAGCTT TCAAAGCATT TACAAGTGCT GCAAGTTAGT GAAACAGCTG TCTCCGTAAA 12780 

TGGAGGAAAT GTGGGGAAGC CTTGGAATGC CCTTCTGGTT CTGGCACATT GGAAAGCACA 12840 
CTCAGAAGGC TTCATCACCA AGATTTTGGG AGAGTAAAGC TAAGTATAGT TGATGTAACA 12900 

TTGTAGAAGC AGCATAGGAA CAATAAGAAC AATAGGTAAA GCTATAATTA TGGCTTATAT 12960 

TTAGAAATGA CTGCATTTGA TATTTTAGGA TATTTTTCTA GGTTTTTTCC TTTCATTTTA 13020 

TTCTCTTCTA GTTTTGACAT TTTATGATAG ATTTGCTCTC TAGAAGGAAA CGTCTTTATT 13080 

TAGGAGGGCA AAAATTTTGG TCATAGCATT CACTTTTGCT ATTCCAATCT ACAACTGGAA 13140 

GATACATAAA AGTGCTTTGC ATTGAATTTG GGATAACTTC AAAAATCCCA TGGTTGTTGT 13200 
TAGGGATAGT ACTAAGCATT TCAGTTCCAG GAGAATAAAA GAAATTCCTA TTTGAAATGA 13260 

ATTCCTCATT TGGAGGAAAA AAAGCATGCA TTCTAGCACA ACAAGATGAA ATTATGGAAT 13320 

ACAAAAGTGG CTCCTTCCCA TGTGCAGTCC CTGTCCCCCC CCGCCAGTCC TCCACACCCA 13380 

AACTGTTTCT GATTGGCTTT TAGCTTTTTG TTGTTTTTTT TTTTCCTTCT AACACTTGTA 13440 

TTTGGAGGCT CTTCTGTGAT TTTGAGAAGT ATACTCTTGA GTGTTTAATA AAGTTTTTTT 13500 
CCAAAAGTA 

Seq ID NO: 99 Protein sequence: 
Protein Accession #: NP_008835.5 

1 11 21 31 41 51 

MAGSGAGVRC SLLRLQETLS AADRCGAALA GHQLIRGLGQ ECVLSSSPAV LALQTSLVFS 60 

RDFGLLVFVR KSLNSIEFRE CREEILKFLC IFLEKMGQKI APYSVEIKNT CTSVYTKDRA 120 

AKCKIPALDL LIKLLQTFRS SRLMDEFKIG ELFSKFYGEL ALKKKIPDTV LEKVYELLGL 180 

LGEVHPSEMI NNAENLFRAF LGELKTQMTS AVREPKLPVL AGCLKGLSSL LCNFTKSMEE 240 

DPQTSREIFN FVLKAIRPQI DLKRYAVPSA GLRLFALHAS QFSTCLLDNY VSLFEVLLKW 300 

CAHTNVELKK AALSALESFL KQVSNMVAKN AEMHKNKLQY FMEQFYGIIR NVDSNNKELS 360 
IAIRGYGLFA GPCKVINAKD VDFMYVELIQ RCKQMFLTQT DTGDDRVYQM PSFLQSVASV 420 

LLYLDTVPEV YTPVLEHLVV MQIDSFPQYS PKMQLVCCRA IVKVFLALAA KGPVLRNCIS 480 

TVVHQGLIRI CSKPVVLPKG PESESEDHRA SGEVRTGKWK VPTYKDYVDL FRHLLSSDQM 540 

MDSILADEAF FSVNSSSESL NHLLYDEFVK SVLKIVEKLD LTLEIQTVGE QENGDEAPGV 600 

WMIPTSDPAA NLHPAKPKDF SAFINLVEFC REILPEKQAE FFEPWVYSFS YELILQSTRL 660 

PLISGFYKLL SITVRNAKKI KYFEGVSPKS LKHSPEDPEK YSCFALFVKF GKEVAVKMKQ 720 

YKDELLASCL TFLLSLPHNI IELDVRAYVP ALQMAFKLGL SYTPLAEVGL NALEEWSIYI 780 

DRHVMQPYYK DILPCLDGYL KTSALSDETK NNWEVSALSR AAQKGFNKVV LKHLKKTKNL 840 

SSNEAISLEE IRIRVVQMLG SLGGQINKNL LTVTSSDEMM KSYVAWDREK RLSFAVPFRE 900 

MKPVIFLDVF LPRVTELALT ASDRQTKVAA CELLHSMVMF MLGKATQMPE GGQGAPPMYQ 960 

LYKRTFPVLL RLACDVDQVT RQLYEPLVMQ LIHWFTNNKK FESQDTVALL EAILDGIVDP 1020 

VDSTLRDFCG RCIREFLKWS IKQITPQQQE KSPVNTKSLF KRLYSLALHP NAFKRLGASL 1080 

AFNNIYREFR EEESLVEQFV FEALVIYMES LALAHADEKS LGTIQQCCDA IDHLCRIIEK 1140 

KHVSLNKAKK RRLPRGFPPS ASLCLLDLVK WLLAHCGRPQ TECRHKSIEL FYKFVPLLPG 1200 

NRSPNLWLKD VLKEEGVSFL INTFEGGGCG QPSGILAQPT LLYLRGPFSL QATLCWLDLL 1260 

LAALECYNTF IGERTVGALQ VLGTEAQSSL LKAVAFFLES IAMHDIIAAE KCFGTGAAGN 1320 

RTSPQEGERY NYSKCTVVVR IMEFTTTLLN TSPEGWKLLK KDLCNTHLMR VLVQTLCEPA 1380 

SIGFNIGDVQ VMAHLPDVCV NLMKALKMSP YKDILETHLR EKITAQSIEE LCAVNLYGPD 1440 

AQVDRSRLAA VVSACKQLHR AGLLHNILPS QSTDLHHSVG TELLSLVYKG IAPGDERQCL 1500 

PSLDLSCKQL ASGLLELAFA FGGLCERLVS LLLNPAVLST ASLGSSQGSV IHFSHGEYFY 1560 

SLFSETINTE LLKNLDLAVL ELMQSSVDNT KMVSAVLNGM LDQSFRERAN QKHQGLKLAT 1620 

TILQHWKKCD SWWAKDSPLE TKMAVLALLA KILQIDSSVS FNTSHGSFPE VFTTYISLLA 1680 

DTKLDLHLKG QAVTLLPFFT SLTGGSLEEL RRVLEQLIVA HFPMQSREFP PGTPRFNNYV 1740 

DCMKKFLDAL ELSQSPMLLE LMTEVLCREQ QHVMEELFQS SFRRIARRGS CVTQVGLLES 1800 

VYEMFRKDDP RLSFTRQSFV DRSLLTLLWH CSLDALREFF STIVVDAIDV LKSRFTKLNE 1860 

STFDTQITKK MGYYKILDVM YSRLPKDDVH AKESKINQVF HGSCITEGNE LTKTLIKLCY 1920 

DAFTENMAGE NQLLERRRLY HCAAYNCAIS VICCVFNELK FYQGFLFSEK PEKNLLIFEN 1980 

LIDLKRRYNF PVEVEVPMER KKKYIEIRKE AREAAHGDSD GPSYMSSLSY LADSTLSEEM 2040 

SQFDFSTGVQ SYSYSSQDPR PATGRFRRRE QRDPTVHDDV LELEMDELNR HECMAPLTAL 2100 

VKHMHRSLGP PQGEEDSVPR DLPSWMKFLH GKLGNPIVPL NIRLFLAKLV INTEEVFRPY 2160 

AKHWLSPLLQ LAASENNGGE GIHYMVVEIV ATILSWTGLA TPTGVPKDEV LANRLLNFLM 2220 

KHVFHPKRAV FRHNLEIIKT LVECWKDCLS IPYRLIFEKF SGKDPNSKDN SVGIQLLGIV 2280 

MANDLPPYDP QCGIQSSEYF QALVNNMSFV RYKEVYAAAA EVLGLILRYV MERKNILEES 2340 

LCELVAKQLK QHQNTMEDKF IVCLNKVTKS FPPLADRFMN AVFFLLPKFH GVLKTLCLEV 2400 

VLCRVEGMTE LYFQLKSKDF VQVMRHRDDE RQKVCLDIIY KMMPKLKPVE LRELLNPVVE 2460 

FVSHPSTTCR EQMYNILMWI HDNYRDPESE TDNDSQEIFK LAKDVLIQGL IDENPGLQLI 2520 

IRNFWSHETR LPSNTLDRLL ALWSLYSPKI EVHFLSLATN FLLEMTSMSP DYPNPMFEHP 2580 

LSECEFQEYT IDSDWRFRST VLTPMFVETQ ASQGTLQTRT QEGSLSARWP VAGQIRATQQ 2640 

QHDFTLTQTA DGRSSFDWLT GSSTDPLVDH TSPSSDSLLF AHKRSERLQR APLKSVGPDF 2700 

GKKRLGLPGD EVDNKVKGAA GRTDLLRLRR RFMRDQEKLS LMYARKGVAE QKREKEIKSE 2760 

LKMKQDAQVV LYRSYRHGDL PDIQIKHSSL ITPLQAVAQR DPIIAKQLFS SLFSGILKEM 2820 

DKFKTLSEKN NITQKLLQDF NRFLNTTFSF FPPFVSCIQD ISCQHAALLS LDPAAVSAGC 2880 

LASLQQPVGI RLLEEALLRL LPAELPAKRV RGKARLPPDV LRWVELAKLY RSIGEYDVLR 2940 

GIFTSEIGTK QITQSALLAE ARSDYSEAAK QYDEALNKQD WVDGEPTEAE KDFWELASLD 3000 

CYNHLAEWKS LEYCSTASID SENPPDLNKI WSEPFYQETY LPYMIRSKLK LLLQGEADQS 3060 

LLTFIDKAMH GELQKAILEL HYSQELSLLY LLQDDVDRAK YYIQNGIQSF MQNYSSIDVL 3120 

LHQSRLTKLQ SVQALTEIQE FISFISKQGN LSSQVPLKRL LNTWTNRYPD AKMDPMNIWD 3180 

DIITNRCFFL SKIEEKLTPL PEDNSMNVDQ DGDPSDRMEV QEQEEDISSL IRSCKFSMKM 3240 

KMIDSARKQN NFSLAMKLLK ELHKESKTRD DWLVSWVQSY CRLSHCRSRS QGCSEQVLTV 3300 

LKTVSLLDEN NVSSYLSKNI LAFRDQNILL GTTYRIIANA LSSEPACLAE IEEDKARRIL 3360 

ELSGSSSEDS EKVIAGLYQR AFQHLSEAVQ AAEEEAQPPS WSCGPAAGVI DAYMTLADFC 3420 

DQQLRKEEEN ASVIDSAELQ AYPALVVEKM LKALKLNSNE ARLKFPRLLQ IIERYPEETL 3480 

SLMTKEISSV PCWQFISWIS HMVALLDKDQ AVAVQHSVEE ITDNYPQAIV YPFIISSESY 3540 

SFKDTSTGHK NKEFVARIKS KLDQGGVIQD FINALDQLSN PELLFKDWSN DVRAELAKTP 3600 

VNKKNIEKMY ERMYAALGDP KAPGLGAFRR KFIQTFGKEF DKHFGKGGSK LLRMKLSDFN 3660 

DITNMLLLKM NKDSKPPGNL KECSPWMSDF KVEFLRNELE IPGQYDGRGK PLPEYHVRIA 3720 

GFDERVTVMA SLRRPKRIII RGHDEREHPF LVKGGEDLRQ DQRVEQLFQV MNGILAQDSA 3780 

CSQRALQLRT YSVVPMTSRL GLIEWLENTV TLKDLLLNTM SQEEKAAYLS DPRAPPCEYK 3840 

DWLTKMSGKH DVGAYMLMYK GANRTETVTS FRKRESKVPA DLLKRAFVRM STSPEAFLAL 3900 

RSHFASSHAL ICISHWILGI GDRHLNNFMV AMETGGVIGI DFGHAFGSAT QFLPVPELMP 3960 

FRLTRQFINL MLPMKETGLM YSIMVHALRA FRSDPGLLTN TMDVFVKEPS FDWKNFEQKM 4020 

LKKGGSWIQE INVAEKNWYP RQKICYAKRK LAGANPAVIT CDELLLGHEK APAFRDYVAV 4080 
ARGSKDHNIR AQEPESGLSE ETQVKCLMDQ ATDPNILGRT WEGWEPWM 

Seq ID NO: 100 DNA sequence 

ATGTGAAGGC ACAAGCTGCT GTTATATACA ACAGAGTGAA CTGAGCATCA GTCAGAAAAA 60 

GTCTATGTTT GCAGAAATAC AGATCCAAGA CAAAGACAGG ATGGGCACTG CTGGAAAAGT 120 

TATTAAATGC AAAGCAGCTG TGCTTTGGGA GCAGAAGCAA CCCTTCTCCA TTGAGGAAAT 180 

AGAAGTTGCC CCACCAAAGA CTAAAGAAGT TCGCATTAAG ATTTTGGCCA CAGGAATCTG 240 

TCGCACAGAT GACCATGTGA TAAAAGGAAC AATGGTGTCC AAGTTTCCAG TGATTGTGGG 300 

ACATGAGGCA ACTGGGATTG TAGAGAGCAT TGGAGAAGGA GTGACTACAG TGAAACCAGG 360 

TGACAAAGTC ATCCCTCTCT TTCTGCCACA ATGTAGAGAA TGCAATGCTT GTCGCAACCC 420 

AGATGGCAAC CTTTGCATTA GGAGCGATAT TACTGGTCGT GGAGTACTGG CTGATGGCAC 480 

CACCAGATTT ACATGCAAGG GCAAACCAGT ACACCACTTC ATGAACACCA GTACATTTAC 540 

CGAGTACACA GTGGTGGATG AATCTTCTGT TGCTAAGATT GATGATGCAG CTCCTCCTGA 600 

GAAAGTCTGT TTAATTGGCT GTGGGTTTTC CACTGGATAT GGCGCTGCTG TTAAAACTGG 660 

CAAGGTCAAA CCTGGTTCCA CTTGCGTCGT CTTTGGCCTG GGAGGAGTTG GCCTGTCAGT 720 

CATCATGGGC TGTAAGTCAG CTGGTGCATC TAGGATCATT GGGATTGACC TCAACAAAGA 780 

CAAATTTGAG AAGGCCATGG CTGTAGGTGC CACTGAGTGT ATCAGTCCCA AGGACTCTAC 840 

CAAACCCATC AGTGAGGTGC TGTCAGAAAT GACAGGCAAC AACGTGGGAT ACACCTTTGA 900 

AGTTATTGGG CATCTTGAAA CCATGATTGA TGCCCTGGCA TCCTGCCACA TGAACTATGG 960 

GACCAGCGTG GTTGTAGGAG TTCCTCCATC AGCCAAGATG CTCACCTATG ACCCGATGTT 1020 




GCTCTTCACT GGACGCACAT GGAAGGGATG TGTCTTTGGA GGTTTGAAAA GCAGAGATGA 1080 

TGTCCCAAAA CTAGTGACTG AGTTCCTGGC AAAGAAATTT GACCTGGACC AGTTGATAAC 1140 

TCATGTTTTA CCATTTAAAA AAATCAGTGA AGGATTTGAG CTGCTCAATT CAGGACAAAG 1200 

CATTCGAACG GTCCTGACGT TTTGAGATCC AAAGTGGCAG GAGGTCTGTG TTGTCATGGT 1260 

GAACTGGAGT TTCTCTTGTG AGAGTTCCCT CATCTGAAAT CATGTATCTG TCTCACAAAT 1320 

ACAAGCATAA GTAGAAGATT TGTTGAAGAC ATAGAACCCT TATAAAGAAT TATTAACCTT 1380 

TATAAACATT TAAAGTCTTG TGAGCACCTG GGAATTAGTA TAATAACAAT GTTAATATTT 1440 

TTGATTTACA TTTTGTAAGG CTATAATTGT ATCTTTTAAG AAAACATACA CTTGGATTTC 1500 

TATGTTGAAA TGGAGATTTT TAAGAGTTTT AACCAGCTGC TGCAGATATA TAACTCAAAA 1560 

CAGATATAGC GTATAAAGAT ATAGTAAATG CATCTCCCAG AGTAATATTC ACTTAACACA 1620 

TTGAAACTAT TATTTTTTAG ATTTGAATAT AAATGTATTT TTTAAACACT TGTTATGAGT 1680 

TAACTTGGAT TACATTTTGA AATCAGTTCA TTCCATGATG CATATTACTG GATTAGATTA 1740 

AGAAAGACAG AAAAGATTAA GGGACGGGCA CATTTTTCAA CGATTAAGAA TCATCATTAC 1800 

ATAACTTGGT GAAACTGAAA AAGTATATCA TATGGGTACA CAAGGCTATT TGCCAGCATA 1860 

TATTAATATT TTAGAAAATA TTCCTTTTGT AATACTGAAT ATAAACATAG AGCTAGAGTC 1920 

ATATTATCAT ACTTATCATA ATGTTCAATT TGATACAGTA GAATTGCAAG TCCCTAAGTC 1980 

CCTATTCACT GTGCTTAGTA GTGACTCCAT TTAATAAAAA GTGTTTTTAG TTTTTAACAA 2040 
CTAAACCG 

Seq ID NO: 101 Protein sequence: 
Protein Accession #: NP_000664 

1 11 21 31 41 51 

| | | | | | 

MGTAGKVIKC KAAVLWEQKQ PFSIEEIEVA PPKTKEVRIK ILATGICRTD DHVIKGTMVS 60 

KFPVIVGHEA TGIVESIGEG VTTVKPGDKV IPLFLPQCRE CNACRNPDGN LCIRSDITGR 120 

GVLADGTTRF TCKGKPVHHF MNTSTFTEYT VVDESSVAKI DDAAPPEKVC LIGCGFSTGY 180 

GAAVKTGKVK PGSTCVVFGL GGVGLSVIMG CKSAGASRII GIDLNKDKFE KAMAVGATEC 240 

ISPKDSTKPI SEVLSEMTGN NVGYTFEVIG HLETMIDALA SCHMNYGTSV VVGVPPSAKM 300 

LTYDPMLLFT GRTWKGCVFG GLKSRDDVPK LVTEFLAKKF DLDQLITHVL PFKKISEGFE 360 
LLNSGQSIRT VLTF 

Seq ID NO: 102 DNA sequence 

Nucleic Acid Accession #: NM_006783.1 

Coding sequence: 1..786 

1 11 21 31 41 51 

| | | | | | 

ATGGATTGGG GGACGCTGCA CACTTTCATC GGGGGTGTCA ACAAACACTC CACCAGCATC 60 

GGGAAGGTGT GGATCACAGT CATCTTTATT TTCCGAGTCA TGATCCTAGT GGTGGCTGCC 120 

CAGGAAGTGT GGGGTGACGA GCAAGAGGAC TTCGTCTGCA ACACACTGCA ACCGGGATGC 180 

AAAAATGTGT GCTATGACCA CTTTTTCCCG GTGTCCCACA TCCGGCTGTG GGCCCTCCAG 240 

CTGATCTTCG TCTCCACCCC AGCGCTGCTG GTGGCCATGC ATGTGGCCTA CTACAGGCAC 300 

GAAACCACTC GCAAGTTCAG GCGAGGAGAG AAGAGGAATG ATTTCAAAGA CATAGAGGAC 360 

ATTAAAAAGC ACAAGGTTCG GATAGAGGGG TCGCTGTGGT GGACGTACAC CAGCAGCATC 420 

TTTTTCCGAA TCATCTTTGA AGCAGCCTTT ATGTATGTGT TTTACTTCCT TTACAATGGG 480 

TACCACCTGC CCTGGGTGTT GAAATGTGGG ATTGACCCCT GCCCCAACCT TGTTGACTGC 540 

TTTATTTCTA GGCCAACAGA GAAGACCGTG TTTACCATTT TTATGATTTC TGCGTCTGTG 600 

ATTTGCATGC TGCTTAACGT GGCAGAGTTG TGCTACCTGC TGCTGAAAGT GTGTTTTAGG 660 

AGATCAAAGA GAGCACAGAC GCAAAAAAAT CACCCCAATC ATGCCCTAAA GGAGAGTAAG 720 

CAGAATGAAA TGAATGAGCT GATTTCAGAT AGTGGTCAAA ATGCAATCAC AGGTTTCCCA 780 
AGCTAA 

Seq ID NO: 103 Protein sequence: 
Protein Accession #: NP_006774.1 

1 11 21 31 41 51 

| | | | | | 

MDWGTLHTFI GGVNKHSTSI GKVWITVIFI FRVMILVVAA QEVWGDEQED FVCNTLQPGC 60 

KNVCYDHFFP VSHIRLWALQ LIFVSTPALL VAMHVAYYRH ETTRKFRRGE KRNDFKDIED 120 

IKKHKVRIEG SLWWTYTSSI FFRIIFEAAF MYVFYFLYNG YHLPWVLKCG IDPCPNLVDC 180 

FISRPTEKTV FTIFMISASV ICMLLNVAEL CYLLLKVCFR RSKRAQTQKN KPNHALKESK 240 
QNEMNELISD SGQNAITGFP S 

Seq ID NO: 104 DNA sequence 
Nucleic Acid Accession #: NM_020411 
Coding sequence: 86-526 

1 11 21 31 41 51 

I I I I I I 

GGACCTGGGA AGGAGCATAG GACAGGGCAA GGCGGGATAA GGAGGGGCAC CACAGCCCTT 60 

AAGGCACGAG GGAACCTCAC TGCGCATGCT CCTTTGGTGC CCACCTCAGT GCGCATGTTC 120 

ACTGGGCGTC TTCCCATCGG CCCCTTCGCC AGTGTGGGGA ACGCGGCGGA GCTGTGAGCC 180 

GGCGACTCGG GTCCCTGAGG TCTGGATTCT TTCTCCGCTA CTGAGACACG GCGGACACAC 240 

ACAAACACAG AACCACACAG CCAGTCCCAG GAGCCCAGTA ATGGAGAGCC CCAAAAAGAA 300 

GAACCAGCAG CTGAAAGTCG GGATCCTACA CCTGGGCAGC AGACAGAAGA AGATCAGGAT 360 

ACAGCTGAGA TCCCAGTGCG CGACATGGAA GGTGATCTGC AAGAGCTGCA TCAGTCAAAC 420 

ACCGGGGATA AATCTGGATT TGGGTTCCGG CGTCAAGGTG AAGATAATAC CTAAAGAGGA 480 

ACACTGTAAA ATGCCAGAAG CAGGTGAAGA GCAACCACAA GTTTAAATGA AGACAAGCTG 540 

AAACAACGCA AGCTGGTTTT ATATTAGATA TTTGACTTAA ACTATCTCAA TAAAGTTTTG 600 
CAGCTTTCAC CAAAAAAAAA AAAAAA 



Seq ID NO: 105 Protein sequence: 
Protein Accession #: NP_065144.1 

1 11 21 31 41 51 

| | | | | | 

MLLWCPPQCA CSLGVFPSAP SPVWGTRRSC EPATRVPEVW ILSPLLRHGG HTQTQNHTAS 60 

PRSPVMESPK KKKQQLKVGI LHLGSRQKKI RIQLRSQCAT WKVICKSCIS QTPGINLDLG 120 
SGVKVKIIPK EEHCKMPEAG EEQPQV 

Seq ID NO: 106 DNA sequence 
Nucleic Acid Accession #: J04129 
Coding sequence: 99-587 

1 11 21 31 41 51 

| | | | | | 

CATCCCTCTG GCTCCAGAGC TCAGAGCCAC CCACAGCCGC AGCCATGCTG TGCCTCCTGC 60 

TCACCCTGGG CGTGGCCCTG GTCTGTGGTG TCCCGGCCAT GGACATCCCC CAGACCAAGC 120 

AGGACCTGGA GCTCCCAAAG TTGGCAGGGA CCTGGCACTC CATGGCCATG GCGACCAACA 180 

ACATCTCCCT CATGGCGACA CTGAAGGCCC CTCTGAGGGT CCACATCACC TCACTGTTGC 240 

CCACCCCCGA GGACAACCTG GAGATCGTTC TGCACAGATG GGAGAACAAC AGCTGTGTTG 300 

AGAAGAAGGT CCTTGGAGAG AAGACTGGGA ATCCAAAGAA GTTCAAGATC AACTATACGG 360 

TGGCGAACGA GGCCACGCTG CTCGATACTG ACTACGACAA TTTCCTGTTT CTCTGCCTAC 420 

AGGACACCAC CACCCCCATC CAGAGCATGA TGTGCCAGTA CCTGGCCAGA GTCCTGGTGG 480 

AGGACGATGA GATCATGCAG GGATTCATCA GGGCTTTCAG GCCCCTGCCC AGGCACCTAT 540 

GGTACTTGCT GGACTTGAAA CAGATGGAAG AGCCGTGCCG TTTCTAGCTC ACCTCCGCCT 600 

CCAGGAAGAC CAGACTCCCA CCCTTCCACA CCTCCAGAGC AGTGGGACTT CCTCCTGCCC 660 

TTTCAAAGAA TAACCACAGC TCAGAAGACG ATGACGTGGT CATCTGTGTC GCCATCCCCT 720 

TCCTGCTGCA CACCTGCACC ATTGCCATGG GGAGGCTGCT CCCTGGGGGC AGAGTCTCTG 780 
GCAGAGGTTA TTAATAAACC CTTGGAGCAT G 



Seq ID NO: 107 Protein sequence: 
Protein Accession #: AAA60147 

1 11 21 31 41 51 

| | | | | | 

MDIPQTKQDL ELPKLAGTWH SMAMATNNIS LMATLKAPLR VHITSLLPTP EDNLEIVLHR 60 

WENNSCVEKK VLGEKTGNPK KFKINYTVAN EATLLDTDYD NFLFLCLQDT TTPIQSMMCQ 120 
YLARVLVEDD EIMQGFIRAF RPLPRHLWYL LDLKQMEEPC RF 

Seq ID NO: 108 DNA sequence 
Nucleic Acid Accession #: Eos sequence 
Coding sequence: 48-794 

1 11 21 31 41 51 

| | | | | | 

TCCCAGGCAG CAGTTAGCCC GCCGCCCGCC TGTGTGTCCC CAGAGCCATG GAGAGAGCCA 60 

GTCTGATCCA GAAGGCCAAG CTGGCAGAGC AGGCCGAACG CTATGAGGAC ATGGCAGCCT 120 

TCATGAAAGG CGCCGTGGAG AAGGGCGAGG AGCTCTCCTG CGAAGAGCGA AACCTGCTCT 180 

CAGTAGCCTA TAAGAACGTG GTGGGCGGCC AGAGGGCTGC CTGGAGGGTG CTGTCCAGTA 240 

TTGAGCAGAA AAGCAACGAG GAGGGCTCGG AGGAGAAGGG GCCCGAGGTG CGTGAGTACC 300 

GGGAGAAGGT GGAGACTGAG CTCCAGGGCG TGTGCGACAC CGTGCTGGGC CTGCTGGACA 360 

GCCACCTCAT CAAGGAGGCC GGGGACGCCG AGAGCCGGGT CTTCTACCTG AAGATGAAGG 420 

GTGACTACTA CCGCTACCTG GCCGAGGTGG CCACCGGTGA CGACAAGAAG CGCATCATTG 480 

ACTCAGCCCG GTCAGCCTAC CAGGAGGCCA TGGACATCAG CAAGAAGGAG ATGCCGCCCA 540 

CCAACCCCAT CCGCCTGGGC CTGGCCCTGA ACTTTTCCGT CTTCCACTAC GAGATCGCCA 600 

ACAGCCCCGA GGAGGCCATC TCTCTGGCCA AGACCACTTT CGACGAGGCC ATGGCTGATC 660 

TGCACACCCT CAGCGAGGAC TCCTACAAAG ACAGCACCCT GATCATGCAG CTGCTGCGAG 720 

ACAACCTGAC ACTGTGGACG GCCGACAACG CCGGGGAAGA GGGGGGCGAG GCTCCCCAGG 780 

AGCCCCAGAG CTGAGTGTTG CCCGCCACCG CCCCGCCCTG CCCCCTCCAG TCCCCCACCC 840 

TGCCGAGAGG ACTAGTATGG GGTGGGAGGC CCCACCCTTC TCCCCTAGGC GCTGTTCTTG 900 

CTCCAAAGGG CTCCGTGGAG AGGGACTGGC AGAGCTGAGG CCACCTGGGG CTGGGGATCC 960 

CACTCTTCTT GCAGCTGTTG AGCGCACCTA ACCACTGGTC ATGCCCCCAC CCCTGCTCTC 1020 

CGCACCCGCT TCCTCCCGAC CCCAGGACCA GGCTACTTCT CCCCTCCTCT TGCCTCCCTC 1080 

CTGCCCCTGC TGCCTCTGAT CGTAGGAATT GAGGAGTGTC CCGCCTTGTG GCTGAGAACT 1140 

GGACAGTGGC AGGGGCTGGA GATGGGTGTG TGTGTGTGTG TGTGTGTGTG TGTGTGTGTG 1200 

CGCGCGCGCC AGTGCAAGAC CGAGATTGAG GGAAAGCATG TCTGCTGGGT GTGACCATGT 1260 
TTCCTCTCAA TAAAGTTCCC CTGTGACACT C 



Seq ID NO: 109 Protein sequence: 
Protein Accession #: NP_006133.1 

1 11 21 31 41 51 

| | | | | | 

MERASLIQKA KLAEQAERYE DMAAFMKGAV EKGEELSCEE RNLLSVAYKN VVGGQRAAWR 60 

VLSSIEQKSN EEGSEEKGPE VREYREKVET ELQGVCDTVL GLLDSHLIKE AGDAESRVFY 120 

LKMKGDYYRY LAEVATGDDK KRIIDSARSA YQEAMDISKK EMPPTNPIRL GLALNFSVFH 180 

YEIANSPEEA ISLAKTTFDE AMADLHTLSE DSYKDSTLIM QLLRDNLTLW TADNAGEEGG 240 
EAPQEPQS 

Seq ID NO: 110 DNA sequence 
Nucleic Acid Accession #: NM_000695 
Coding sequence: 407-1564 

1 11 21 31 41 51 

| | | | | | 

CACGAGTTGG TTTGGGAGCT GCCAGTCTCC TGGGAGGATC GCAGTCAGCA GAGCAGGGCT 60 

GAGGCCTGGG GGTAGGAGCA GAGCCTGCGC ATCTGGAGGC AGCATGTCCA AGAAAGGGAG 120 

TGGAGGTGCA GCGAAGGACC CAGGGGCAGA GCCCACGCTG GGGATGGACC CCTTCGAGGA 180 

CACACTGCGG CGGCTGCGTG AGGCCTTCAA CTGAGGGCGC ACGCGGCCGG CCGAGTTCCG 240 

GGCTGCGCAG CTCCAGGGCC TGGGCCACTT CCTTCAAGAA AACAAGCAGC TTCTGCGCGA 300 

GGTGCTGGCC CAGGACCTGC ATAAGCCAGC 
TTGCCAGAAC GAGGTTGACT ACGCTCTCAA 
ACGGTCCACG AACCTGTTCA TGAAGCTGGA 
CCTGGTCCTC ATCATCGCAC CCTGGAACTA 
GGGCACCCTC CCOGCAGGGA ATTGGGTGGT 
AGAGAAGGTC CTGGCTGAGG TGCTGCCCCA 
GCTGGGC6GA CCCCAGGAGA CAGGGCAGCT 
CACAGGGAGC CCTCGTGTGG GCAAGATTGT 
TGTCACCCTG GAGCTGGGGG GCAAGAACCC 
GACCGTGGCC AACCGOGTGG CCTGGTTCTG 
CCCTGACTAC GTCCTGTGCA GCCCOQAGAT 
CACCATCACC CGTTTCTATG GCGACGACCC 
CAACCAGAAA CAGTTCCAGC GGCTGCGGGC 
GGGCCAGAGC AACGAGAGCG ATCGCTACAT 
GACGGAGCCT GTGATGCAGG AGGAGATCTT 
GAGCGTGGAC GAGGCCATCA AGTTCATCAA 
CTTCTCCAAC AGCAGACAGG TTGTGAACCA 
TGGAGGCAAT GAGGGCTTCA CCTACATATC 
CCACAGTGGG ATGGGCCGGT ACCACGGCAA 
CACCTGCCTG CTCGCCCCCT CCGGCCTGGA 
TACCGACTGG AACCAGCAGC TGTTACGCTG 
GTGAGCGTCC CACCCGCCTC CAACGGGTCA 
GCTTATGCTC CCAACTCACA TTGTTCCTCC 
TGGAGCTGTC ACATGACTGC ATCCTGCCTG 
TCTGGGGGAC GCTGCTCGAG AGAGGCCGAG 
CACCCCACCC TCCCCAATTC CAGCCCTTTG 
CACAGGGGCA GTGTCACCCT GGAAAATACA 
GAACGGTTGA GAGCGTGGAG CCCTCCAGGC 
TTCCACCTCT GCCCCATCCC AACTGCACCA 
CCCACACTGG TCTCTGCACC ACCCCTCTGG 
AGCTCCATCC ACTGGGAAAA CTGGGGTTTG 
CTGGGGGCAA GTCCCTTGAC TTCTCTGAGC 
CCAAAATGGA GTCACTTATG CCAAACTCTA 
CCCTCACACA CACATGCCCG TAACAGGATT 
AGACACAGGG CGTATGGAAA AGCACGTCCT 
GATGCTTACC TACCACGGCC GTCTCCACCA 
TGTGACTTAC AAACCTTGTT TAAAAGCTGC 
CCCTTGGCTG TGGCCCTCTG TGTATGCCTG 
GGAATCCTCT GCTCCTCCCA AATAAATTCA 



Seq ID NO: 111 Protein sequence: 
Protein Accession &: NP 000686 



PCT/US02/12476 



TTTCGAGGCA 
GAACCTTCAG 
CTCGGTCTTC 
CCCATTGAAC 
GCTGAAGCCG 
GTACCTGGAC 
GCTAGAGCAC 
CATGACTGCT 
CTGCTACGTG 
CTACTTCAAT 
GCAGGAGAGG 
CCAGAGCTCC 
ATTGCTGGGC 
CGCCCCCACG 
CGGGCCCATC 
CCGGCAGGAG 
GATGCTGGAG 
TCTGCTGTCC 
GTTCACCTTC 
GAAATTAAAG 
GGGCATGGGC 
CACAGAGAAA 
AGACCGCAGG 
CCAGGGCTGC 
AGGCCGCAGA 
CCCTCTCGGT 
GTGCCCTGCC 
CTTTGCTCTC 
GCACTGCCTC 
TTCACACCGC 
CATCACTCCA 
CTCAGTTTCC 
ATAAAATGGA 
TATCACCAAG 
CAAAGACTGT 
GAAAACCATC 
TTACATGGAC 
GGATCCTTCC 
TCTGTTC 



GACATATCTG 
GCCTGGATGA 
ATCTGGAAGG 
CTGACCCTGG 
TCAGAAATCA 
CAGAGCTGGT 
AAGTTGGACT 
GCCACCAAGC 
GACGACAACT 
GCOGGCCAGA 
CTGCTGCCCG 
CCAAACCTGG 
TGCGGCCGCG 
GTGCTGGTGG 
CTGCCCATOG 
AAGCCCCTGG 
CGGACCAGCA 
GTGCCATTOG 
GACACCTTCT 
GAGATCCGCT 
TCCCAGAGCT 
CCTGAGTCTA 
CTCCCCCAGC 
AAAGCAAGGT 
ACATGCCAGG 
CAGGGTTGGC 
TTCTTAGGGG 
CCCTCTAGGC 
CCCCAGGGAT 
ACCCTGCACT 
CTGCACAGTG 
TTATGTGAAA 
GTCGGGGGGG 
ACACGCCTGC 
AGTATTCCAG 
GCCAACTCCT 
TTCTGTCCTT 
AAGCACTCAT 



AGCTCATCCT 
AGGATGAACC 
AACCCTTTGG 
TGCTCCTGGT 
GCCAGGGCAC 
TTGCCGTGGT 
ACATCTTCTT 
ACCTGACGCC 
GCGACCCCCA 
CCTGCGTGGC 
CCCTGCAGAG 
GCCGCATCAT 
TGGCCATTGG 
ACGTGCAGGA 
TGAACGTGCA 
CCCTGTACGC 
GCGGCAGCTT 
GGGGAGTCGG 
CCCACCACCG 
ACCCACCCTA 
GCACCCTCCT 
GCCATGAGGG 
CTCAGGTTGC 
CTTGCTTCTA 
TGTCCTCACT 
CAGGCCCAGT 
CATCAGCCCT 
ACACGCGCAC 
CCTCTCACAT 
CACCCACAGC 
TTAGTGGGAC 
GTTGCTGGAA 
CACATAGAAG 
ATGTAAGACC 
ATGAGCTGCA 
GCGATCAGCT 
TAAAACGTTC 
AGCCCAGATA 



360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 



1 11 21 31 41 51

I I I I I I

MKDEPRSTNL FMKLDSVFIW KEPFGLVLII APWNYPLNLT LVLLVGTLPA GNCWLKPSE 60
ISQGTEKVLA EVLPQYLDQS CPAWLGGPQ ETGQLLEHKL DYIFFTGSPR VGKIVMTAAT 120
KHLTPVTLEL GGKNPCYVDD NCDPQTVANR VAWPCYPNAG QTCVAPDYVL CSPEMQERLL 180
PALQSTITRP YGDDPQSSPN LGRIINQKQP QRLRALLGCG RVAIGGQSNB SDRYIAPTVL 240
VDVQETEPVM QEEIFGPILP IVNVQSVDEA IKFINRQEKP LALYAFSNSR QWNQMLERT 300
SSGSFGGNBG FTYISLLSVP FGGVGHSGMG RYHGKFTFDT FSHHRTCLLA PSGLEKLKEI 360
RYPPYTDWNQ QLLRWGMGSQ SCTLL

Seq ID NO: 112 DNA sequence
Nucleic Acid Accession #: NM_004456
Coding sequence: 58-2298

1 11 21 31 41 51

I I I I I I

GAATTCCGGG CGACGCGCGG GAACAACGCG AGTCGGCGCG CGGGACGAAG AATAATCATG 60

GGCCAGACTG GGAAGAAATC TGAGAAGGGA CCAGTTTGTT GGCGGAAGCG TGTAAAATCA 120

GAGTACATGC GACTGAGACA GCTCAAGAGG TTCAGACGAG CTGATGAAGT AAAGAGTATG 180

TTTAGTTCCA ATCGTCAGAA AATTTTGGAA AGAACGGAAA TCTTAAACCA AGAATGGAAA 240

CAGCGAAGGA TACAGCCTGT GCACATCCTG ACTTCTGTGA GCTCATTGCG CGGGACTAGG 300

GAGTGTTCGG TGACCAGTGA CTTGGATTTT CCAACACAAG TCATCCCATT AAAGACTCTG 360

AATGCAGTTG CTTCAGTACC CATAATGTAT TCTTGGTCTC CCCTACAGCA GAATTTTATG 420

GTGGAAGATG AAACTGTTTT ACATAACATT CCTTATATGG GAGATGAAGT TTTAGATCAG 480

GATGGTACTT TCATTGAAGA ACTAATAAAA AATTATGATG GGAAAGTACA CGGGGATAGA 540

GAATGTGGGT TTATAAATGA TGAAATTTTT GTGGAGTTGG TGAATGCCCT TGGTCAATAT 600

AATGATGATG ACGATGATGA TGATGGAGAC GATCCTGAAG AAAGAGAAGA AAAGCAGAAA 660

GATCTGGAGG ATCACCGAGA TGATAAAGAA AGCCGCCCAC CTCGGAAATT TCCTTCTGAT 720

AAAATTTTGG AGGCCATTTC CTCAATGTTT CCAGATAAGG GCACAGCAGA AGAACTAAAG 780

GAAAAATATA AAGAACTCAC CGAACAGCAG CTCCCAGGCG CACTTCCTCC TGAATGTACC 840

CCCAACATAG ATGGACCAAA TGCTAAATCT GTTCAGAGAG AGCAAAGCTT ACACTCCTTT 900

CATACGCTTT TCTGTAGGCG ATGTTTTAAA TATGACTGCT TCCTACATCC TTTTCATGCA 960

ACACCCAACA CTTATAAGCG GAAGAACACA GAAACAGCTC TAGACAACAA ACCTTGTGGA 1020

CCACAGTGTT ACCAGCATTT GGAGGGAGCA AAGGAGTTTG CTGCTGCTCT CACCGCTGAG 1080

CGGATAAAGA CCCCACCAAA ACGTCCAGGA GGCCGCAGAA GAGGACGGCT TCCCAATAAC 1140

AGTAGCAGGC CCAGCACCCC CACCATTAAT GTGCTGGAAT CAAAGGATAC AGACAGTGAT 1200

AGGGAAGCAG GGACTGAAAC GGGGGGAGAG AACAATGATA AAGAAGAAGA AGAGAAGAAA 1260

GATGAAACTT CGAGCTCCTC TGAAGCAAAT TCTCGGTGTC AAACACCAAT AAAGATGAAG 1320

CCAAATATTG AACCTCCTGA GAATGTGGAG TGGAGTGGTG CTGAAGCCTC AATGTTTAGA 1380

GTCCTCATTG GCACTTACTA TGACAATTTC TGTGCCATTG CTAGGTTAAT TGGGACCAAA 1440

ACATGTAGAC AGGTGTATGA GTTTAGAGTC AAAGAATCTA GCATCATAGC TCCAGCTCCC 1500

GCTGAGGATG TGGATACTCC TCCAAGGAAA AAGAAGAGGA AACACCGGTT GTGGGCTGCA 1560

CACTGCAGAA AGATACAGCT GAAAAAGGAC GGCTCCTCTA ACCATGTTTA CAACTATCAA 1620




CCCTGTGATC ATCCACGGCA GCCTTGTGAC AGTTCGTGCC CTTGTGTGAT AGCACAAAAT 1680

TTTTGTGAAA AGTTTTGTCA ATGTAGTTCA GAGTGTCAAA ACCGCTTTCC GGGATGCCGC 1740

TGCAAAGCAC AGTGCAACAC CAAGCAGTGC CCGTGCTACC TGGCTGTCCG AGAGTGTGAC 1800 

CCTGACCTCT GTCTTACTTG TGGAGCCGCT GACCATTGGG ACAGTAAAAA TGTOTCCTGC 1860

AAGAACTGCA GTATTCAGCG GGGCTCCAAA AAGCATCTAT TGCTGGCACC ATCTGACGTG 1920 

GCAGGCTGGG GGATTTTTAT CAAAGATCCT GTGCAGAAAA ATGAATTCAT CTCAGAATAC 1980 

TGTGGAGAGA TTATTTCTCA AGATGAAGCT GACAGAAGAG GGAAAGTGTA TGATAAATAC 2040 

ATGTGCAGCT TTCTGTTCAA CTTGAACAAT GATTTTGTGG TGGATGCAAC CCGCAAGGGT 2100

AACAAAATTC GTTTTGCAAA TCATTCGGTA AATCCAAACT GCTATGCAAA AGTTATGATG 2160 

GTTAACGGTG ATCACAGGAT AGGTATTTTT GCCAAGAGAG CCATCCAGAC TGGCGAAGAG 2220

CTGTTTGTTG ATTACAGATA CAGCCAGGCT GATGCCCTGA AGTATGTCGG CATCGAAAGA 2280 

GAAATGGAAA TCCCTTGACA TCTGCTACCT CCTCCCCCTC CTCTGAAACA GCTGCCTTAG 2340 

CTTCAGGAAC CTCGAGTACT GTGGGCAATT TAGAAAAAGA ACATGCAGTT TGAAATTCTG 2400 

AATTTGCAAA GTACTGTAAG AATAATTTAT AGTAATGAGT TTAAAAATCA ACTTTTTATT 2460 

GCCTTCTCAC CAGCTGCAAA GTGTTTTGTA CCAGTGAATT TTTGCAATAA TGCAGTATGG 2520 
TACATTTTTC AACTTTGAAT AAAGAATACT TGAACTTGAA AAAAAAAAAA AAAAAA 






Seq ID NO: 113 Protein sequence:
Protein Accession #: NP_004447

1 11 21 31 41 51

I I I I I I

MGQTGKKSEK GPVCWRKRVK SEYMRLRQLK RFRRADEVKS MFSSNRQKIL ERTEILNQEW 60
KQRRIQPVHI LTSVSSLRGT RECSVTSDLD FPTQVIPLKT LNAVASVPIM YSWSPLQQNF 120
MVEDETVLHN IPYMGDEVLD QDGTFIEELI KNYDGKVHGD RECGFINDEI FVELVNALGQ 180
YNDDDDDDDG DDPEEREEKQ KDLEDHRDDK ESRPPRKFPS DKILEAISSM FPDKGTAEEL 240
KEKYKELTEQ QLPGALPPEC TPNIDGPNAK SVQREQSLHS FHTLFCRRCF KYDCFLHPFH 300
ATPNTYKRKN TETALDNKPC GPQCYQHLEG AKEFAAALTA ERIKTPPKRP GGRRRGRLPN 360
NSSRPSTPTI NVLESKDTDS DREAGTETGG ENNDKEEEEK KDETSSSSEA NSRCQTPIKM 420
KPNIEPPENV EWSGAEASMF RVLIGTYYDN FCAIARLIGT KTCRQVYEFR VKESSIIAPA 480
PAEDVDTPPR KKKRKHRLWA AHCRKIQLKK DGSSNHVYNY QPCDHPRQPC DSSCPCVIAQ 540
NFCEKFCQCS SECQNRFPGC RCKAQCNTKQ CPCYLAVREC DPDLCLTCGA ADHWDSKNVS 600
CKNCSIQRGS KKHLLLAPSD VAGWGIFIKD PVQKNEFISE YCGEIISQDE ADRRGKVYDK 660
YMCSFLFNLN NDFVVDATRK GNKIRFANHS VNPNCYAKVM MVNGDHRIGI FAKRAIQTGE 720
ELFVDYRYSQ ADALKYVGIE REMEIP

Seq ID NO: 114 DNA sequence
Nucleic Acid Accession #: NM_001827
Coding sequence: 96-335

1 11 21 31 41 51

I I I I I I

AGTCTCCGGC GAGTTGTTGC CTGGGCTGGA CGTGGTTTTG TCTGCTGCGC CCGCTCTTCG 60
CGCTCTCGTT TCATTTTCTG CAGCGCGCCA CGAGGATGGC CCACAAGCAG ATCTACTACT 120
CGGACAAGTA CTTCGACGAA CACTACGAGT ACCGGCATGT TATGTTACCC AGAGAACTTT 180
CCAAACAAGT ACCTAAAACT CATCTGATGT CTGAAGAGGA GTGGAGGAGA CTTGGTGTCC 240
AACAGAGTCT AGGCTGGGTT CATTACATGA TTCATGAGCC AGAACCACAT ATTCTTCTCT 300
TTAGACGACC TCTTCCAAAA GATCAACAAA AATGAAGTTT ATCTGGGGAT CGTCAAATCT 360
TTTTCAAATT TAATGTATAT GTGTATATAA GGTAGTATTC AGTGAATACT TGAGAAATGT 420
ACAAATCTTT CATCCATACC TGTGCATGAG CTGTATTCTT CACAGCAACA GAGCTCAGTT 480
AAATGCAACT GCAAGTAGGT TACTGTAAGA TGTTTAAGAT AAAAGTTCTT CCAGTCAGTT 540
TTTCTCTTAA GTGCCTGTTT GAGTTTACTG AAACAGTTTA CTTTTGTTCA ATAAAGTTTG 600
TATGTTGCAT TTAAAAAAAA AAAAAAA

Seq ID NO: 115 Protein sequence:
Protein Accession #: NP_001818

1 11 21 31 41 51

I I I I I I

MAHKQIYYSD KYFDEHYEYR HVMLPRELSK QVPKTHLMSE EEWRRLGVQQ SLGWVHYMIH 60
EPEPHILLFR RPLPKDQQK

Seq ID NO: 116 DNA sequence

Nucleic Acid Accession #: CAT cluster

1 11 21 31 41 51

I I I I I I

TCAGACCTCA TGAGTCACTT GGACTCTTGA GCCACCTCTG GGGGTGGAGT CTCTCTCCTG 60

GCATCTGGAC CCTTGGTGCT ATCGACGAAG CTTGGGTGGG GCTCTTAGCT GCTATGTGCA 120

AGAGGTGTGT TCCAGGGAAA GCCCCTATCT CTCTGCAGAG GTCAAGTGAA AGCGACGGCC 180

GCAGCCAACA GAGTTCAAAA TGCAGGCTTG GAAAGTACAG GGGGCTCTGT GGAGGATGGG 240

AAGGACTGAT CCACATTCCC ACCAGGAAGT TTAGCAGAAC CCCCGCGTGC CAACTGGACC 300

CCTTGGAAGG ACCTGGCTCA GGCTGGACCA CCTCTTGAGA GGGAGGAGCT CTGGATTTGA 360

TCAAGAATTC TTTGCTGAGC ATGGTGCCTC ATGCCTATAA TACCAACACT TTGGGAGGCC 420

AGTGTGGGAG GATCTCTTGA GCCCAGGAGT TCAAGACTAG CCTGGGCAAC ACAGAGAGAA 480

CCCATCTCTA AAATAATAAT AATAATAAAA TAAAAAATTA GCAGGGCATG GTGGCATGTG 540

CCTGTAGTTC CAGCTACCCA GGAGGCTGAG GCAAGAGGAT GGCTGGAGCC TGGGATGTTG 600

AGGCTGCAAT GAACTGTGAT TACCCCACTG CACTCCAGCC TGGGCAAAAG AGCGAGAGAA 660

CCTGTCTCAA ATAATAATAA TAATAATAAT CTTATTTTGG AGAATAAAGA GACCTCTGGA 720

TTTGAGGTGC CATTTGGGTA GAAAGAAAAG ACGTTTACAC CGAGAAATAG TCTGTGTTGC 780

CCTGAAGGAG CAGAGGGATG CATCGCTGGA GGTGACCTAC AGTTGAAGAA GACTCATTAT 840

GACAGACCTT GTCCTTCTTC CTTGTGGAAA GTGTTTCCTC TGCTGCTACT GCTCATGAGA 900

CTCTTCCCCC TCCCTGTCCC AGGGAACCAA AGGGCTTTCT ACCACACCCT TTCTTGCCCC 960

CCGCCTCCCA TGTCTGCTGT GCCTTTGTAC TCAGCAATTC TTGTTTGCTC CATTATCTTC 1020

CAGCCGGATA CAGAGTGAAT AGTTAACCAC ACTTAGGTCA AATAGGATCT AAATTTTTGT 1080

TCCTGCTCCG TGTAAAGAGG CCAGTGTTTG TGTGTTGCAA GCAGCCTTGG AATAGTAACT 1140
CTTCTCATTT GTTTGGGATC TGGCCACCAA GTTCCAGAAT GATACACGGA TCAGTGCAGA 1200

AGTTCATCAG GCTCTCGGAC CTTAGGGCTG TTGGAGAAGG CTTCAGCAGC AGAACTGATG 1260 

GTGAAGGCTC GTGTTCTCCA TCCTCAACTT TCTTTGCTTC GATCATACAC AAGAATACAT 1320 

TTGGAAGGGC AAAAAATGAA CACTGTCGTT CATTGCAGCC GTGTTTTGTG ACACAGATGC 1380 

ACAGTCTGCT GTGAAGACCT TCTCTCAAGT GGCATTTGGG AGTCCATGCC AGATCATGGT 1440 

GCTTCATGAG AGACTGACAG CTATCAGGGG TTGTGGCACT TAGTGAGGAC TCTCCTCCCC 1500 

CAGTGTGTGC TGATGACACA TACACACCTG ACAATAGCTT GAGTCTTCTC TGTTCCTTTT 1560 

ACTCTGTAGC CAACATACAC ATGATTTAAA ACCCTTTCTA AATATCTATC ATGGTTCATC 1620 

CTTGTCCAAA TGCAGAGTCA GAGCTATTTG TACTTCATTA TTATTTCCAA GGCGAATAGT 1680 

TGGCTTTCTT TTTGCAAAAA TAATTAAAGT TTTTGTATGT TGCAAAAAAA AAAAAAAAAA 1740 
AAACAAAAAA 

Seq ID NO: 117 DNA sequence 

Nucleic Acid Accession #: BC012178.1 

Coding sequence: 204-2285 

1 11 21 31 41 51 

I I I I I I 

CTTCTCTCCC GOGGCGCTGG GGCCCGCGCT CCGCTGCTGT TGCTCCATTC GGCGCTTTTC 60 

TGGCGGCTGG CTCCTCTCCG CTGCCGGCTG CTCCTCGACC AGGCCTCCTT CTCAACCTCA 120 

GCCCGCGGCG CCGACCCTTC CGGCACCCTC CCGCCCCGTC TCGTACTGTC GCCGTCACCG 180 

CCGCGGCTCC GGCCCTGGCC CCGATGGCTC TGTGCAACGG AGACTCCAAG CTGGAGAATG 240

CTGGAGGAGA CCTTAAGGAT GGCCACCACC ACTATGAAGG AGCTGTTGTC ATTCTGGATG 300

CTGGTGCTCA GTACGGGAAA GTCATAGACC GAAGAGTGAG GGAACTGTTC GTGCAGTCTG 360

AAATTTTCCC CTTGGAAACA CCAGCATTTG CTATAAAGGA ACAAGGATTC CGTGCTATTA 420 

TCATCTCTGG AGGACCTAAT TCTGTGTATG CTGAAGATGC TCCCTGGTTT GATCCAGCAA 480 

TATTCACTAT TGGCAAGCCT GTTCTTGGAA TTTGCTATGG TATGCAGATG ATGAATAAGG 540 

TATTTGGAGG TACTGTGCAC AAAAAAAGTG TCAGAGAAGA TGGAGTTTTC AACATTAGTG 600 

TGGATAATAC ATGTTCATTA TTCAGGGGCC TTCAGAAGGA AGAAGTTGTT TTGCTTACAC 660 

ATGGAGATAG TGTAGACAAA GTAGCTGATG GATTCAAGGT TGTGGCACGT TCTGGAAACA 720 

TAGTAGCAGG CATAGCAAAT GAATCTAAAA AGTTATATGG AGCACAGTTC CACCCTGAAG 780 

TTGGCCTTAC AGAAAATGGA AAAGTAATAC TGAAGAATTT CCTTTATGAT ATAGCTGGAT 840 

GCAGTGGAAC CTTCACCGTG CAGAACAGAG AACTTGAGTG TATTCGAGAG ATCAAAGAGA 900 

GAGTAGGCAC GTCAAAAGTT TTGGTTTTAC TCAGTGGTGG AGTAGACTCA ACAGTTTGTA 960 

CAGCTTTGCT AAATCGTGCT TTGAACCAAG AACAAGTCAT TGCTGTGCAC ATTGATAATG 1020 

GCTTTATGAG AAAACGAGAA AGCCAGTCTG TTGAAGAGGC CCTCAAAAAG CTTGGAATTC 1080 

AGGTCAAAGT GATAAATGCT GCTCATTCTT TCTACAATGG AACAACAACC CTACCAATAT 1140 

CAGATGAAGA TAGAACCCCA CGGAAAAGAA TTAGCAAAAC GTTAAATATG ACCACAAGTC 1200 

CTGAAGAGAA AAGAAAAATC ATTGGGGATA CTTTTGTTAA GATTGCCAAT GAAGTAATTG 1260 

GAGAAATGAA CTTGAAACCA GAGGAGGTTT TCCTTGCCCA AGGTACTTTA CGGCCTGATC 1320

TAATTGAAAG TGCATCCCTT GTTGCAAGTG GCAAAGCTGA ACTCATCAAA ACCCATCACA 1380 

ATGACACAGA GCTCATCAGA AAGTTGAGAG AGGAGGGAAA AGTAATAGAA CCTCTGAAAG 1440 

ATTTTCATAA AGATGAAGTG AGAATTTTGG GCAGAGAACT TGGACTTCCA GAAGAGTTAG 1500 

TTTCCAGGCA TCCATTTCCA GGTCCTGGCC TGGCAATCAG AGTAATATGT GCTGAAGAAC 1560 

CTTATATTTG TAAGGACTTT CCTGAAACCA ACAATATTTT GAAAATAGTA GCTGATTTTT 1620 

CTGCAAGTGT TAAAAAGCCA CATACCCTAT TACAGAGAGT CAAAGCCTGC ACAACAGAAG 1680 

AGGATCAGGA GAAGCTGATG CAAATTACCA GTCTGCATTC ACTGAATGCC TTCTTGCTGC 1740 

CAATTAAAAC TGTAGGTGTG CAGGGTGACT GTCGTTCCTA CAGTTACGTG TGTGGAATCT 1800 

CCAGTAAAGA TGAACCTGAC TGGGAATCAC TTATTTTTCT GGCTAGGCTT ATACCTCGCA 1860 

TGTGTCACAA CGTTAACAGA GTTGTTTATA TATTTGGCCC ACCAGTTAAA GAACCTCCTA 1920 

CAGATGTTAC TCCCACTTTC TTGACAACAG GGGTGCTCAG TACTTTACGC CAAGCTGATT 1980 

TTGAGGCCCA TAACATTCTC AGGGAGTCTG GGTATGCTGG GAAAATCAGC CAGATGCCGG 2040 

TGATTTTGAC ACCATTACAT TTTGATCGGG ACCCACTTCA AAAGCAGCCT TCATGCCAGA 2100 

GATCTGTGGT TATTCGAACC TTTATTACTA GTGACTTCAT GACTGGTATA CCTGCAACAC 2160 

CTGGCAATGA GATCCCTGTA GAGGTGGTAT TAAAGATGGT CACTGAGATT AAGAAGATTC 2220 

CTGGTATTTC TCGAATTATG TATGACTTAA CATCAAAGCC CCCAGGAACT ACTGAGTGGG 2280 
AGTAATAAAC TTCTTGTTCT ATTAAAA 



Seq ID NO: 118 Protein sequence:
Protein Accession #: AAH12178.1 

1 11 21 31 41 51 

I I I I I I 

MALCNGDSKL ENAGGDLKDG HHHYEGAVVI LDAGAQYGKV IDRRVRELFV QSEIFPLETP 60

AFAIKEQGFR AIIISGGPNS VYAEDAPWFD PAIFTIGKPV LGICYGMQMM NKVFGGTVHK 120

KSVREDGVFN ISVDNTCSLF RGLQKEEVVL LTHGDSVDKV ADGFKVVARS GNIVAGIANE 180

SKKLYGAQFH PEVGLTENGK VILKNFLYDI AGCSGTFTVQ NRELECIREI KERVGTSKVL 240

VLLSGGVDST VCTALLNRAL NQEQVIAVHI DNGFMRKRES QSVEEALKKL GIQVKVINAA 300

HSFYNGTTTL PISDEDRTPR KRISKTLNMT TSPEEKRKII GDTFVKIANE VIGEMNLKPE 360

EVFLAQGTLR PDLIESASLV ASGKAELIKT HHNDTELIRK LREEGKVIEP LKDFHKDEVR 420

ILGRELGLPE ELVSRHPFPG PGLAIRVICA EEPYICKDFP ETNNILKIVA DFSASVKKPH 480

TLLQRVKACT TEEDQEKLMQ ITSLHSLNAF LLPIKTVGVQ GDCRSYSYVC GISSKDEPDW 540

ESLIFLARLI PRMCHNVNRV VYIFGPPVKE PPTDVTPTFL TTGVLSTLRQ ADFEAHNILR 600

ESGYAGKISQ MPVILTPLHF DRDPLQKQPS CQRSVVIRTF ITSDFMTGIP ATPGNEIPVE 660
VVLKMVTEIK KIPGISRIMY DLTSKPPGTT EWE

Seq ID NO: 119 DNA sequence

Nucleic Acid Accession #: NM_006500.1

Coding sequence: 27..1967

1 11 21 31 41 51 

I I I I I I 

ACTTGOGTCT CGCCCTCCGG CCAAGCATGG GGCTTCCCAG GCTGGTCTGC GCCTTCTTGC 60 
TCGCCGCCTG CTGCTGCTGT CCTCGCGTCG CGGGTGTGCC CGGAGAGGCT GAGCAGCCTG 120 
CGCCTGAGCT GGTGGAGGTG GAAGTGGGCA GCACAGCCCT TCTGAAGTGC GGCCTCTCCC 180 
AGTCCCAAGG CAACCTCAGC CATGTCGACT GGTTTTCTGT CCACAAGGAG AAGCGGACGC 240 
TCATCTTCCG TGTGCGCCAG GGCCAGGGCC AGAGCGAACC TGGGGAGTAC GAGCAGCGGC 300

TCAGCCTCCA GGACAGAGGG GCTACTCTGG CCCTGACTCA AGTCACCCCC CAAGACGAGC 360 

GCATCTTCTT GTGCCAGGGC AAGCGCCCTC GGTCCCAGGA GTACCGCATC CAGCTCCGCG 420

TCTACAAAGC TCCGGAGGAG CCAAACATCC AGGTCAACCC CCTGGGCATC CCTGTGAACA 480 

GTAAGGAGCC TGAGGAGGTC GCTACCTGTG TAGGGAGGAA CGGGTACCCC ATTCCTCAAG 540 

TCATCTGGTA CAAGAATGGC CGGCCTCTGA AGGAGGAGAA GAACCGGGTC CACATTCAGT 600

CGTCCCAGAC TGTGGAGTCG AGTGGTTTGT ACACCTTGCA GAGTATTCTG AAGGCACAGC 660 

TGGTTAAAGA AGACAAAGAT GCCCAGTTTT ACTGTGAGCT CAACTACCGG CTGCCCAGTG 720

GGAACCACAT GAAGGAGTCC AGGGAAGTCA CCGTCCCTGT TTTCTACCCG ACAGAAAAAG 780 

TGTGGCTGGA AGTGGAGCCC GTGGGAATGC TGAAGGAAGG GGACCGCGTG GAAATCAGGT 840 

GTTTGGCTGA TGGCAACCCT CCACCACACT TCAGCATCAG CAAGCAGAAC CCCAGCACCA 900 

GGGAGGCAGA GGAAGAGACA ACCAACGACA ACGGGGTCCT GGTGCTGGAG CCTGCCCGGA 960 

AGGAACACAG TGGGCGCTAT GAATGTCAGG CCTGGAACTT GGACACCATG ATATCGCTGC 1020 

TGAGTGAACC ACAGGAACTA CTGGTGAACT ATGTGTCTGA CGTCCGAGTG AGTCCCGCAG 1080 

CCCCTGAGAG ACAGGAAGGC AGCAGCCTCA CCCTGACCTG TGAGGCAGAG AGTAGCCAGG 1140 

ACCTCGAGTT CCAGTGGCTG AGAGAAGAGA CAGACCAGGT GCTGGAAAGG GGGCCTGTGC 1200 

TTCAGTTGCA TGACCTGAAA CGGGAGGCAG GAGGCGGCTA TCGCTGCGTG GCGTCTGTGC 1260 

CCAGCATACC CGGCCTGAAC CGCACACAGC TGGTCAAGCT GGCCATTTTT GGCCCCCCTT 1320 

GGATGGCATT CAAGGAGAGG AAGGTGTGGG TGAAAGAGAA TATGGTGTTG AATCTGTCTT 1380 

GTGAAGCGTC AGGGCACCCC CGGCCCACCA TCTCCTGGAA CGTCAACGGC ACGGCAAGTG 1440 

AACAAGACCA AGATCCACAG CGAGTCCTGA GCACCCTGAA TGTCCTCGTG ACCCCGGAGC 1500 

TGTTGGAGAC AGGTGTTGAA TGCACGGCCT CCAACGACCT GGGCAAAAAC ACCAGCATCC 1560 

TCTTCCTGGA GCTGGTCAAT TTAACCACCC TCACACCAGA CTCCAACACA ACCACTGGCC 1620 

TCAGCACTTC CACTGCCAGT CCTCATACCA GAGCCAACAG CACCTCCACA GAGAGAAAGC 1680 

TGCCGGAGCC GGAGAGCCGG GGCGTGGTCA TCGTGGCTGT GATTGTGTGC ATCCTGGTCC 1740

TGGCGGTGCT GGGCGCTGTC CTCTATTTCC TCTATAAGAA GGGCAAGCTG CCGTGCAGGC 1800 

GCTCAGGGAA GCAGGAGATC ACGCTGCCCC CGTCTCGTAA GACCGAACTT GTAGTTGAAG 1860 

TTAAGTCAGA TAAGCTCCCA GAAGAGATGG GCCTCCTGCA GGGCAGCAGC GGTGACAAGA 1920 

GGGCTCCGGG AGACCAGGGA GAGAAATACA TCGATCTGAG GCATTAGCCC CGAATCACTT 1980 

CAGCTCCCTT CCCTGCCTGG ACCATTCCCA GCTCCCTGCT CACTCTTCTC TCAGCCAAAG 2040 

CCTCCAAAGG GACTAGAGAG AAGCCTCCTG CTCCCCTCAC CTGCACACCC CCTTTCAGAG 2100 

GGCCACTGGG TTAGGACCTG AGGACCTCAC TTGGCCCTGC AAGCCGCTTT TCAGGGACCA 2160 

GTCCACCACC ATCTCCTCCA CGTTGAGTGA AGCTCATCCC AAGCAAGGAG CCCCAGTCTC 2220 

CCGAGCGGGT AGGAGAGTTT CTTGCAGAAC GTGTTTTTTC TTTACACACA TTATGGCTGT 2280 

AAATACCTGG CTCCTGCCAG CAGCTGAGCT GGGTAGCCTC TCTGAGCTGG TTTCCTGCCC 2340 

CAAAGGCTGG CTTCCACCAT CCAGGTGCAC CACTGAAGTG AGGACACACC GGAGCCAGGC 2400 

GCCTGCTCAT GTTGAAGTGC GCTGTTCACA CCCGCTCCGG AGAGCACCCC AGCGGCATCC 2460 

AGAAGCAGCT GCAGTGTTGC TGCCACCACC CTCCTGCTCG CCTCTTGAAA GTCTCCTGTG 2520 

ACATTTTTTC TTTGGTCAGA AGCCAGGAAC TGGTGTCATT CCTTAAAAGA TACGTGCCGG 2580 

GGCCAGGTGT GGTGGCTCAC GCCTGTAATC CCAGCACTTT GGGAGGCCGA GGCGGGCGGA 2640 

TCACAAAGTC AGGACGAGAC CATCCTGGCT AACACGGTGA AACCCTGTCT CTACTAAAAA 2700 

TACAAAAAAA AATTAGCTAG GCGTAGTGGT TGGCACCTAT AGTCCCAGCT ACTCGGAAGG 2760 

CTGAAGCAGG AGAATGGTAT GAATCCAGGA GGTGGAGCTT GCAGTGAGCC GAGACCGTGC 2820 

CACTGCACTC CAGCCTGGGC AACACAGCGA GACTCCGTCT CGAGGAAAAA AAAAGAAAAG 2880 

ACGCGTACCT GCGGTGAGGA AGCTGGGCGC TGTTTTCGAG TTCAGGTGAA TTAGCCTCAA 2940

TCCCCGTGTT CACTTGCTCC CATAGCCCTC TTGATGGATC ACGTAAAACT GAAAGGCAGC 3000 

GGGGAGCAGA CAAAGATGAG GTCTACACTG TCCTTCATGG GGATTAAAGC TATGGTTATA 3060 

TTAGCACCAA ACTTCTACAA ACCAAGCTCA GGGCCCCAAC CCTAGAAGGG CCCAAATGAG 3120 

AGAATGGTAC TTAGGGATGG AAAACGGGGC CTGGCTAGAG CTTCGGGTGT GTGTGTCTGT 3180 

CTGTGTGTAT GCATACATAT GTGTGTATAT ATGGTTTTGT CAGGTGTGTA AATTTGCAAA 3240 

TTGTTTCCTT TATATATGTA TGTATATATA TATATGAAAA TATATATATA TATGAAAAAT 3300 

AAAGCTTAAT TGTCCCAGAA AATCATACAT TGCTTTTTTA TTCTACATGG GTACCACAGG 3360 

AACCTGGGGG CCTGTGAAAC TACAACCAAA AGGCACACAA AACCGTTTCC AGTTGGCAGC 3420 

AGAGATCAGG GGTTACCTCT GCTTCTGAGC AAATGGCTCA AGCTCTACCA GAGCAGACAG 3480 

CTACCCTACT TTTCAGCAGC AAAACGTCCC GTATGACGCA GCACGAAGGG CCTGGCAGGC 3540 
TGTTAGCAGG AGCTATGTCC CTTCCTATCG TTTCCGTCCA CTT 



Seq ID NO: 120 Protein sequence: 
Protein Accession #: NP_006491.1



1 11 21 31 41 51 

I I I I I I 

MGLPRLVCAF LLAACCCCPR VAGVPGEAEQ PAPELVEVEV GSTALLKCGL SQSQGNLSHV 60

DWFSVHKEKR TLIFRVRQGQ GQSEPGEYEQ RLSLQDRGAT LALTQVTPQD ERIFLCQGKR 120 

PRSQEYRIQL RVYKAPEEPN IQVNPLGIPV NSKEPEEVAT CVGRNGYPIP QVIWYKNGRP 180

LKEEKNRVHI QSSQTVESSG LYTLQSILKA QLVKEDKDAQ FYCELNYRLP SGNHMKESRE 240 

VTVPVFYPTE KVWLEVEPVG MLKEGDRVEI RCLADGNPPP HFSISKQNPS TREAEEETTN 300

DNGVLVLEPA RKEHSGRYEC QAWNLDTMIS LLSEPQELLV NYVSDVRVSP AAPERQEGSS 360 

LTLTCEAESS QDLEFQWLRE ETDQVLERGP VLQLHDLKRE AGGGYRCVAS VPSIPGLNRT 420 

QLVKLAIFGP PWMAFKERKV WVKENMVLNL SCEASGHPRP TISWNVNGTA SEQDQDPQRV 480 

LSTLNVLVTP ELLETGVECT ASNDLGKNTS ILFLELVNLT TLTPDSNTTT GLSTSTASPH 540 

TRANSTSTER KLPEPESRGV VIVAVIVCIL VLAVLGAVLY FLYKKGKLPC RRSGKQEITL 600 
PPSRKTELVV EVKSDKLPEE MGLLQGSSGD KRAPGDQGEK YIDLRH

Seq ID NO: 121 DNA sequence
Nucleic Acid Accession #: NM_018306
Coding sequence: 60-671 

1 11 21 31 41 51 

I I I I I I 

ATAGTCTACA CAGAGCTCCC CTTGCTGCCC AGACAAGCTG AAGGACCACA GGAAAAGCCA 60 
TGGAGACTTC AGCATCCTCC TCCCAGCCTC AGGACAACAG TCAAGTCCAC AGAGAAACAG 120 
AAGATGTAGA CTATGGAGAG ACAGATTTCC ACAAGCAAGA CGGGAAGGCT GGACTCTTTT 180 
CCCAAGAACA ATATGAGAGA AACAAGTCTT CTTCCTCCTC CTTCTCTTCC TCCTCATCCT 240 
CCTCATCTTC TTCATCCTCC TCCTCCTCAG GTCCTGGGCA TGGGGAGCCT GACGTTTTGA 300 
AGGATGAGCT TCAACTCTAT GGAGATGCTC CTGGAGAGGT GGTACCCTCT GGGGAATCAG 360

GACTCCGAAG GAGAGGCTCT GACCCAGCAA GTGGAGAAGT GGAGGCCTCT CAGTTAAGAA 420 

GACTGAATAT AAAGAAAGAT GATGAGTTTT TCCATTTCGT CCTCCTGTGC TTTGCCATCG 480 

GGGCCTTGCT GGTGTGTTAT CACTATTACG CAGACTGGTT CATGTCTCTT GGGGTCGGCC 540 

TGCTCACCTT CGCCTCCCTG GAAACCGTTG GCATCTACTT CGGACTAGTG TACCGTATCC 600 

ACAGCGTCCT CCAAGGCTTC ATCCCCCTCT TCCAGAAGTT TAGGCTGACA GGGTTCAGGA 660 

AGACTGACTG AGGCCACTTC CAGGTGGGCA GCAGAGGCAG GCCCCAGTGT GACCACCACT 720 

GCGACCCCTG AGCCCACAAG GGCAGAGCAG CATTCTGAGA GACGCACAGG AGACCAAGCC 780 

AGACCAATAA ACAGAACACT TTTCCTTCCA TGTGGTCTGA ATGTTGGCAC CAGCCCGGGC 840 

AGGGGCATCT CATTTGGGCA GTACTGCTGT GCAACCCAGC TGCAAGGATG GAAGGCAGAG 900 

GGTGGGTGTG GGGCCTGAGG CTTCACAGTA CCTGGACCAG CAGGAAGATT CTGGGAGGTC 960 

ACTGCTCTCA GAGGACAGCA AGGGACCCTG AGCTCTGCAA GCTGTGATCT GTCTGGGTTC 1020 

ATGGTTTTTC TCAAATCCCA GGCTATCTGC ATGCGCTCTC AGGTGCTACC GAGCCATCCT 1080 

GGGAGAGATG GATGGTCCAC TGCTTTGAGG CAGGGAGCCA TCGGGCTGGG GCCCCTTGGT 1140 

GAACCTGATG CAGGTAAGAT GCTGAGGACT AAAACCATTT TTTTTGCACC CAAAAAAAAA 1200 

GGCAGGAAAA TGATCATCAG AAACTAAATG GCAGCCAGGC ATGGGGGCTC ACGACTGTAA 1260 

TCCTCGCACT TTGGGAGGCT CAGGCTAAGG GTCGCTTGAA GCTGAGAGTT CAAGACCAAC 1320 

CTGGGCAACA TAGTGAGACC CCCATCTCTA CAATTTTTTT TTAATGACCA AATGTGGCGG 1380 

TACATACCTG TACATACCTG CGGTTCCAGC TACTCAAGAG GCTGAGGCAG GAGGACTGCT 1440 

TGAGCCCAGG AGTTCAGGGC TGCAGTGAGG TACGATCAAG CCACTGCACT CCAGCCTGGG 1500 
CGACAGAGCA AGATCGTTTC TCTAAAATT 



Seq ID NO: 122 Protein sequence: 
Protein Accession #: NP_060776

1 11 21 31 41 51 

I I I I I I 

METSASSSQP QDNSQVHRET EDVDYGETDF HKQDGKAGLF SQEQYERNKS SSSSFSSSSS 60 
SSSSSSSSSS GPGHGEPDVL KDELQLYGDA PGEVVPSGES GLRRRGSDPA SGEVEASQLR 120
RLNIKKDDEF FHFVLLCFAI GALLVCYHYY ADWFMSLGVG LLTFASLETV GIYFGLVYRI 180 
HSVLQGFIPL FQKFRLTGFR KTD 



Seq ID NO: 123 DNA sequence 
Nucleic Acid Accession #: BC022542 
Coding sequence: 243-896

1 11 21 31 41 51

I I I I I I

ACTTGGTCCC AGCCGATAAA TCTGGGGCAG CGCGCGGTAG GAGCTGCGGG CGGCCAGGCC 60 

CCTTCCTGCG TCCGCACCTG GCCCCGCGCG CCCCTCTCGG GCGTCCGGCT TCCGGCGTCC 120 

TGGCGGCTCG GGTGGCGGCG GTTCGGGCGG CCGCCTGGCT GCTCCTCGGG GCGGCGACGG 180

GGCTCACGCG CGGGCCCGCC ACGGCCTTCA CCGCCGCGCG CTCTGACGCC GGCATAAGGG 240 

CCATGTGTTC TGAAATTATT TTGAGGCAAG AAGTTTTGAA AGATGGTTTC CACAGAGACC 300 

TTTTAATCAA AGTGAAGTTT GGGGAAAGCA TTGAGGACTT GCACACGTGC CGTCTCTTAA 360 

TTAAACAGGA CATTCCTGCA GGACTTTATG TGGATCCGTA TGAGTTGGCT TCATTACGAG 420 

AGAGAAACAT AACAGAGGCA GTGATGGTTT CAGAAAATTT TGATATAGAG GCCCCTAACT 480 

ATTTGTCCAA GGAGTCTGAA GTTCTCATTT ATGCCAGACG AGATTCACAG TGCATTGACT 540 

GTTTTCAAGC CTTTTTGCCT GTGCACTGCC GCTATCATCG GCCGCACAGT GAAGATGGAG 600 

AAGCCTCGAT TGTGGTCAAT AACCCAGATT TGTTGATGTT TTGTGACCAA GAGTTCCCGA 660 

TTTTGAAATG CTGGGCTCAC TCAGAAGTGG CAGCCCCTTG TGCTTTGGAT AATGAGGATA 720 

TATGCCAATG GAACAAGATG AAGTATAAAT CAGTATATAA GAATGTGATT CTACAAGTTC 780 

CAGTGGGACT GACTGTACAT ACCTCTCTAG TATGTTCTGT GACTCTGCTC ATTACAATCC 840 

TGTGCTCTAC ATTGATCCTT GTAGCAGTTT TCAAATATGG CCATTTTTCC CTATAAGTTT 900 

TATGTAGTTA AATGCTTCCT AGAAACCTAA ATAAGATCTA TTAATTTCTG ACGAGAGGTG 960 

TTCTTCTAGA ATTAATTACT TTTATCTTTT GTCTTCATTT GTGGCCAAAA TTATGTTTAC 1020 

TAGAGGAAAT TTGGGATCAT TCTCAGCTAA TTCCAAAATG TAGTGCTCTA TTGCATGGAT 1080 

CCTTGGTAAT CCTCAAGCAT CAGATGCCAT AAGGGGAAAC TTAATTCTGC TAAATTAATG 1140 

TTTATTTTGT GAGAAGTGAC TTTATCTTCA TTTGGGGTAG AAAAATTATT TCTTTATGTA 1200 

GTAGAGACAA ATTATTCTCA TTTTGCAAGT ACTTTCAATT TAAGCTACAA ATTGAGAAAA 1260 

CCGTTATAAA TAAGAATAAA ATAGGCCAGG CACAGTGGCT CACACCTGTA ATCCCAGCAC 1320 

TTTGGGAGGC CGAGGTGGGC GGATCACCAG AGGTCAAGAG TTTGAGACCA GCTTGGTGAA 1380 

ACCCTGTCTC TACTAAAAAT ACAAAAGTTA GCTGGGGCTG GTGGTGGGCA TCTGTAGTCC 1440 

CAGCTAATTG GAAGGGTGAG GCGGGAGGAT CGCTTGAACC TGGGAGGCGG AGGTTCCAGA 1500

GAGCCAAGAT CGCACCACTG CACTACAGCC TGGGCGACAG AACGAGACCC TGTCTCCAAA 1560 

GGAAAAACAA AAAAGAAGAA TAAAATAATT TGGATGAAAA TCATGTTTAT TTAAATAGTA 1620 

ATGTCATGAG ACTATTAAAG ATGTGCCAGA GTTTCAATGA AAATCATTAA AGTAGGACAG 1680 

CTAAGAAATT AATATTAATA TAAAAATTAT TGATAATCTT AAATTATTGA TTATTCCTTA 1740 

ACGCACTCCA TTCTCCTTTT ACATTTTATC ATGTTTCTTT TGAATATATG AATTGGCAAA 1800 

GGACTTGATG AAACTGAGTA CTAAGATTTG GTACAGAGTA TGTCAGGAAG ACAACTCAGA 1860
TTGCCATTTT AAATAAAGTT GTACATGAAC AAAAAAAAAA AAAAAA 



Seq ID NO: 124 Protein sequence: 
Protein Accession #: AAH22542

1 11 21 31 41 51

I I I I I I

MCSEIILRQE VLKDGFHRDL LIKVKFGESI EDLHTCRLLI KQDIPAGLYV DPYELASLRE 60 
RNITEAVMVS ENFDIEAPNY LSKESEVLIY ARRDSQCIDC FQAFLPVHCR YHRPHSEDGE 120 
ASIVVNNPDL LMFCDQAGSR RMIRFRFDSF DKTIEFPILK CWAHSEVAAP CALENEDICQ 180
WNKMKYKSVY KNVILQVPVG LTVHTSLVCS VTLLITILCS KKKKK 

Seq ID NO: 125 DNA sequence 

Nucleic Acid Accession #: NM_004994.1

Coding sequence: 20..2143

1 11 21 31 41 51 

I I I I I I 

AGACACCTCT GCCCTCACCA TGAGCCTCTG GCAGCCCCTG GTCCTGGTGC TCCTGGTGCT 60 

GGGCTGCTGC TTTGCTGCCC CCAGACAGCG CCAGTCCACC CTTGTGCTCT TCCCTGGAGA 120 

CCTGAGAACC AATCTCACCG ACAGGCAGCT GGCAGAGGAA TACCTGTACC GCTATGGTTA 180 

CACTCGGGTG GCAGAGATGC GTGGAGAGTC GAAATCTCTG GGGCCTGCGC TGCTGCTTCT 240 

CCAGAAGCAA CTGTCCCTGC CCGAGACCGG TGAGCTGGAT AGCGCCACGC TGAAGGCCAT 300 

GCGAACCCCA CGGTGCGGGG TCCCAGACCT GGGCAGATTC CAAACCTTTG AGGGCGACCT 360 

CAAGTGGCAC CACCACAACA TCACCTATTG GATCCAAAAC TACTCGGAAG ACTTGCCGCG 420

GGCGGTGATT GACGACGCCT TTGCCCGCGC CTTCGCACTG TGGAGCGCGG TGACGCCGCT 480 

CACCTTCACT OGOGTGTACA GCCGGGAOGC AGACATOGTC ATCCAGTTTG GTGTOGCGGA 540 

GCACGGAGAC GGGTATCCCT TCGACGGGAA GGACGGGCTC CTGGCACACG CCTTTCCTCC 600 

TGGCCCCGGC ATTCAGGGAG ACGCCCATTT CGACGATGAC GAGTTGTGGT CCCTGGGCAA 660 

GGGCGTCGTG GTTCCAACTC GGTTTGGAAA CGCAGATGGC GCGGCCTGCC ACTTCCCCTT 720 

CATCTTCGAG GGCCGCTCCT ACTCTGCCTG CACCACCGAC GGTCGCTCCG ACGGCTTGCC 780 

CTGGTGCAGT ACCACGGCCA ACTACGACAC CGACGACCGG TTTGGCTTCT GCCCCAGCGA 840 

GAGACTCTAC ACCCGGGACG GCAATGCTGA TGGGAAACCC TGCCAGTTTC CATTCATCTT 900 

CCAAGGCCAA TCCTACTCCG CCTGCACCAC GGACGGTCGC TCCGACGGCT ACCGCTGGTG 960 

CGCCACCACC GCCAACTACG ACCGGGACAA GCTCTTCGGC TTCTGCCCGA CCCGAGCTGA 1020 

CTCGACGGTG ATGGGGGGCA ACTCGGCGGG GGAGCTGTGC GTCTTCCCCT TCACTTTCCT 1080 

GGGTAAGGAG TACTCGACCT GTACCAGCGA GGGCCGCGGA GATGGGCGCC TCTGGTGCGC 1140

TACCACCTCG AACTTTGACA GCGACAAGAA GTGGGGCTTC TGCCCGGACC AAGGATACAG 1200 

TTTGTTCCTC GTGGCGGCGC ATGAGTTCGG CCACGCGCTG GGCTTAGATC ATTCCTCAGT 1260 

GCCGGAGGCG CTCATGTACC CTATGTACCG CTTCACTGAG GGGCCCCCCT TGCATAAGGA 1320 

CGACGTGAAT GGCATCCGGC ACCTCTATGG TCCTCGCCCT GAACCTGAGC CACGGCCTCC 1380 

AACCACCACC ACACCGCAGC CCACGGCTCC CCCGACGGTC TGCCCCACCG GACCCCCCAC 1440 

TGTCCACCCC TCAGAGCGCC CCACAGCTGG CCCCACAGGT CCCCCCTCAG CTGGCCCCAC 1500 

AGGTCCCCCC ACTGCTGGCC CTTCTACGGC CACTACTGTG CCTTTGAGTC CGGTGGACGA 1560 

TGCCTGCAAC GTGAACATCT TCGACGCCAT CGCGGAGATT GGGAACCAGC TGTATTTGTT 1620 

CAAGGATGGG AAGTACTGGC GATTCTCTGA GGGCAGGGGG AGCCGGCCGC AGGGCCCCTT 1680 

CCTTATCGCC GACAAGTGGC CCGCGCTGCC CCGCAAGCTG GACTCGGTCT TTGAGGAGCC 1740 

GCTCTCCAAG AAGCTTTTCT TCTTCTCTGG GCGCCAGGTG TGGGTGTACA CAGGCGCGTC 1800 

GGTGCTGGGC CCGAGGCGTC TGGACAAGCT GGGCCTGGGA GCCGACGTGG CCCAGGTGAC 1860

CGGGGCCCTC CGGAGTGGCA GGGGGAAGAT GCTGCTGTTC AGCGGGCGGC GCCTCTGGAG 1920 

GTTCGACGTG AAGGCGCAGA TGGTGGATCC CCGGAGCGCC AGCGAGGTGG ACCGGATGTT 1980 

CCCCGGGGTG CCTTTGGACA CGCACGACGT CTTCCAGTAC CGAGAGAAAG CCTATTTCTG 2040 

CCAGGACCGC TTCTACTGGC GCGTGAGTTC CCGGAGTGAG TTGAACCAGG TGGACCAAGT 2100 

GGGCTACGTG ACCTATGACA TCCTGCAGTG CCCTGAGGAC TAGGGCTCCC GTCCTGCTTT 2160 

GCAGTGCCAT GTAAATCCCC ACTGGGACCA ACCCTGGGGA AGGAGCCAGT TTGCCGGATA 2220 

CAAACTGGTA TTCTGTTCTG GAGGAAAGGG AGGAGTGGAG GTGGGCTGGG CCCTCTCTTC 2280 
TCACCTTTGT TTTTTGTTGG AGTGTTTCTA ATAAACTTGG ATTCTCTAAC CTTT 



Seq ID NO: 126 Protein sequence: 
Protein Accession #: NP_004985.1

1 11 21 31 41 51

I I I I I I

MSLWQPLVLV LLVLGCCFAA PRQRQSTLVL FPGDLRTNLT DRQLAEEYLY RYGYTRVAEM 60

RGESKSLGPA LLLLQKQLSL PETGELDSAT LKAMRTPRCG VPDLGRFQTF EGDLKWHHHN 120 

ITYWIQNYSE DLPRAVIDDA FARAFALWSA VTPLTFTRVY SRDADIVIQF GVAEHGDGYP 180 

FDGKDGLLAH AFPPGPGIQG DAHFDDDELW SLGKGVVVPT RFGNADGAAC HFPFIFEGRS 240

YSACTTDGRS DGLPWCSTTA NYDTDDRFGF CPSERLYTRD GNADGKPCQF PFIFQGQSYS 300 

ACTTDGRSDG YRWCATTANY DRDKLFGFCP TRADSTVMGG NSAGELCVFP FTFLGKEYST 360 

CTSEGRGDGR LWCATTSNFD SDKKWGFCPD QGYSLFLVAA HEFGHALGLD HSSVPEALMY 420 

PMYRFTEGPP LHKDDVNGIR HLYGPRPEPE PRPPTTTTPQ PTAPPTVCPT GPPTVHPSER 480 

PTAGPTGPPS AGPTGPPTAG PSTATTVPLS PVDDACNVNI FDAIAEIGNQ LYLFKDGKYW 540 

RFSEGRGSRP QGPFLIADKW PALPRKLDSV FEEPLSKKLF FFSGRQVWVY TGASVLGPRR 600 

LDKLGLGADV AQVTGALRSG RGKMLLFSGR RLWRFDVKAQ MVDPRSASEV DRMFPGVPLD 660
THDVFQYREK AYFCQDRFYW RVSSRSELNQ VDQVGYVTYD ILQCPED 

Seq ID NO: 127 DNA sequence 
Nucleic Acid Accession #: NM_004181
Coding sequence: 32-670 

1 11 21 31 41 51 

I I I I I I 

GCAGAAATAG CCTAGGGAGA TCAACCCCGA GATGCTGAAC AAAGTGCTGT CCCGGCTGGG 60 

GGTCGCCGGC CAGTGGCGCT TCGTGGACGT GCTGGGGCTG GAAGAGGAGT CTCTGGGCTC 120 

GGTGCCAGCG CCTGCCTGCG CGCTGCTGCT GCTGTTTCCC CTCACGGCCC AGCATGAGAA 180 

CTTCAGGAAA AAGCAGATTG AAGAGCTGAA GGGACAAGAA GTTAGTCCTA AAGTGTACTT 240 

CATGAAGCAG ACCATTGGGA ATTCCTGTGG CACAATCGGA CTTATTCACG CAGTGGCCAA 300 

TAATCAAGAC AAACTGGGAT TTGAGGATGG ATCAGTTCTG AAACAGTTTC TTTCTGAAAC 360 

AGAGAAAATG TCCCCTGAAG ACAGAGCAAA ATGCTTTGAA AAGAATGAGG CCATACAGGC 420 

AGCCCATGAT GCCGTGGCAC AGGAAGGCCA ATGTCGGGTA GATGACAAGG TGAATTTCCA 480 

TTTTATTCTG TTTAACAACG TGGATGGCCA CCTCTATGAA CTTGATGGAC GAATGCCTTT 540 

TCCGGTGAAC CATGGCGCCA GTTCAGAGGA CACCCTGCTG AAGGACGCTG CCAAGGTGTG 600 

CAGAGAATTC ACCGAGCGTG AGCAAGGAGA AGTCCGCTTC TCTGCCGTGG CTCTCTGCAA 660 

GGCAGCCTAA TGCTCTGTGG GAGGGACTTT GCTGATTTCC CCTCTTCCCT TCAACATGAA 720 

AATATATACC CCCCATGCAG TCTAAAATGC TTCAGTACTT GTGAAACACA GCTGTTCTTC 780 

TGTTCTGCAG ACACGCCTTC CCCTCAGCCA CACCCAGGCA CTTAAGCACA AGCAGAGTGC 840 

ACAGCTGTCC ACTGGGCCAT TGTGGTGTGA GCTTCAGATG GTGAAGCATT CTCCCCAGTG 900 

TATGTCTTGT ATCCGATATC TAACGCTTTA AATGGCTACT TTGGTTTCTG TCTGTAAGTT 960 
AAGACCTTGG ATGTGGTTAT GTTGTCCTAA AGAATAAATT TTGCTGATAG TAGC 



Seq ID NO: 128 Protein sequence: 
Protein Accession #: NP_004172

1 11 21 31 41 51 

I I I I I I 

MLNKVLSRLG VAGQWRFVDV LGLEEESLGS VPAPACALLL LFPLTAQHEN FRKKQIEELK 60 
GQEVSPKVYF MKQTIGNSCG TIGLIHAVAN NQDKLGFEDG SVLKQFLSET EKMSPEDRAK 120
CFEKNEAIQA AHDAVAQEGQ CRVDDKVNFH FILFNNVDGH LYELDGRMPF PVNHGASSED 180
TLLKDAAKVC REFTEREQGE VRFSAVALCK AA



Seq ID NO: 129 DNA sequence 
Nucleic Acid Accession #: NM_000213
Coding sequence: 127-5385

1 11 21 31 41 51 

I I I I I I 

CGCCCGCGCG CTGCAGCCCC ATCTCCTAGC GGCAGCCCAG GCGCGGAGGG AGCGAGTCCG 60 
CCCCGAGGTA GGTCCAGGAC GGGCGCACAG CAGCAGCCGA GGCTGGCCGG GAGAGGGAGG 120 
AAGAGGATGG CAGGGCCACG CCCCAGCCCA TGGGCCAGGC TGCTCCTGGC AGCCTTGATC 180 
AGCGTCAGCC TCTCTGGGAC CTTGGCAAAC CGCTGCAAGA AGGCCCCAGT GAAGAGCTGC 240 
ACGGAGTGTG TCOGTGTGGA TAAGGACTGC GCCTACTGCA CAGACGAGAT GTTCAGGGAC 300 
CGGCGCTGCA ACACCCAGGC GGAGCTGCTG GCCGCGGGCT GCCAGCGGGA GAGCATCGTG 360 
GTCATGGAGA GCAGCTTCCA AATCACAGAG GAGACCCAGA TTGACACCAC CCTGCGGCGC 420 
AGCCAGATGT CCCCCCAAGG CCTGCGGGTC CGTCTGCGGC CCGGTGAGGA GOGGCATTTT 480 
GAGCTGGAGG TGTTTGAGCC ACTGGAGAGC CCCGTGGACC TGTACATCCT CATGGACTTC 540 
TCCAACTCCA TGTCOGATGA TCTGGACAAC CTCAAGAAGA TGGGGCAGAA CCTGGCTCGG 600 
GTCCTGAGCC AGCTCACCAG CGACTACACT ATTGGATTTG GCAAGTTTGT GGACAAAGTC 660 
AGOGTCCCGC AGACGGACAT GAGGCCTGAG AAGCTGAAGG AGCCCTGGCC CAACAGTGAC 720 
CCCCCCTTCT CCTTCAAGAA CGTCATCAGC CTGACAGAAG ATGTGGATGA GTTCCGGAAT 780 
AAACTGCAGG GAGAGOGGAT CTCAGGCAAC CTGGATGCTC CTGAGGGCGG CTTCGATGCC 840 
ATCCTGCAGA CAGCTGTGTG CACGAGGGAC ATTGGCTGGC GCCCGGACAG CACCCACCTG 900 
CTGGTCTTCT CCACCGAGTC AGCCTTCCAC TATGAGGCTG ATGGCGCCAA CGTGCTGGCT 960 

GGCATCATGA GCCGCAACGA TGAACGGTGC CACCTGGACA CCACGGGCAC CTACACCCAG 1020 

TACAGGACAC AGGACTACCC GTCGGTGCCC ACCCTGGTGC GCCTGCTCGC CAAGCACAAC 1080 

ATCATCCCCA TCTTTGCTGT CACCAACTAC TCCTATAGCT ACTACGAGAA GCTTCACACC 1140 

TATTTCCCTG TCTCCTCACT GGGGGTGCTG CAGGAGGACT CGTCCAACAT CGTGGAGCTG 1200 

CTGGAGGAGG CCTTCAATCG GATCCGCTCC AACCTGGACA TCOGGGCCCT AGACAGCCCC 1260 

CGAGGCCTTC GGACAGAGGT CACCTCCAAG ATGTTCCAGA AGACGAGGAC TGGGTCCTTT 1320 

CACATCCGGC GGGGGGAAGT GGGTATATAC CAGGTGCAGC TGCGGGCCCT TGAGCACGTG 1380 

GATGGGACGC ACGTGTGCCA GCTGCCGGAG GACCAGAAGG GCAACATCCA TCTGAAACCT 1440 

TCCTTCTCCG ACGGCCTCAA GATGGACGCG GGCATCATCT GTGATGTGTG CACCTGCGAG 1500 

CTGCAAAAAG AGGTGCGGTC AGCTCGCTGC AGCTTCAACG GAGACTTCGT GTGCGGACAG 1560 

TGTGTGTGCA GCGAGGGCTG GAGTGGCCAG ACCTGCAACT GCTCCACCGG CTCTCTGAGT 1620 

GACATTCAGC CCTGCCTGCG GGAGGGCGAG GACAAGCCGT GCTCCGGCCG TGGGGAGTGC 1680 

CAGTGCGGGC ACTGTGTGTG CTACGGCGAA GGCCGCTACG AGGGTCAGTT CTGCGAGTAT 1740 

GACAACTTCC AGTGTCCCCG CACTTCCGGG TTCCTCTGCA ATGACCGAGG ACGCTGCTCC 1800 

ATGGGCCAGT GTGTGTGTGA GCCTGGTTGG ACAGGCCCAA GCTGTGACTG TCCCCTCAGC 1860 

AATGCCACCT GCATCGACAG CAATGGGGGC ATCTGTAATG GACGTGGCCA CTGTGAGTGT 1920 

GGCCGCTGCC ACTGCCACCA GCAGTCGCTC TACACGGACA CCATCTGCGA GATCAACTAC 1980 

TCGGCGATCC ACCOGGGCCT CTGCGAGGAC CTACGCTCCT GCGTGCAGTG CCAGGCGTGG 2040 

GGCACCGGCG AGAAGAAGGG GCGCACGTGT GAGGAATGCA ACTTCAAGGT CAAGATGGTG 2100 

GACGAGCTTA AGAGAGCCGA GGAGGTGGTG GTGCGCTGCT CCTTCCGGGA CGAGGATGAC 2160 

GACTGCACCT ACAGCTACAC CATGGAAGGT GACGGCGCCC CTGGGCCCAA CAGCACTGTC 2220 

CTGGTGCACA AGAAGAAGGA CTGCCCTCCG GGCTCCTTCT GGTGGCTCAT CCCCCTGCTC 2280 

CTCCTCCTCC TGCCGCTCCT GGCCCTGCTA CTGCTGCTAT GCTGGAAGTA CTGTGCCTGC 2340 

TGCAAGGCCT GCCTGGCACT TCTCCCGTGC TGCAACCGAG GTCACATGGT GGGCTTTAAG 2400 

GAAGACCACT ACATGCTGCG GGAGAACCTG ATGGCCTCTG ACCACTTGGA CACGCCCATG 2460 

CTGCGCAGCG GGAACCTCAA GGGCCGTGAC GTGGTCCGCT GGAAGGTCAC CAACAACATG 2520 

CAGCGGCCTG GCTTTGCCAC TCATGCCGCC AGCATCAACC CCACAGAGCT GGTGCCCTAC 2580 

GGGCTGTCCT TGGGCCTGGC CCGCCTTTGC ACCGAGAACC TGCTGAAGCC TGACACTCGG 2640 

GAGTGCGCCC AGCTGCGCCA GGAGGTGGAG GAGAACCTGA ACGAGGTCTA CAGGCAGATC 2700 

TCCGGTGTAC ACAAGCTCCA GCAGACCAAG TTCCGGCAGC AGCCCAATGC CGGGAAAAAG 2760 

CAAGACCACA CCATTGTGGA CACAGTGCTG ATGGCGCCCC GCTCGGCCAA GCCGGCCCTG 2820 

CTGAAGCTTA CAGAGAAGCA GGTGGAACAG AGGGCCTTCC ACGACCTCAA GGTGGCCCCC 2880 

GGCTACTACA CCCTCACTGC AGACCAGGAC GCCCGGGGCA TGGTGGAGTT CCAGGAGGGC 2940 

GTGGAGCTGG TGGACGTACG GGTGCCCCTC TTTATCCGGC CTGAGGATGA CGACGAGAAG 3000 

CAGCTGCTGG TGGAGGCCAT CGACGTGCCC GCAGGCACTG CCACCCTCGG CCGCCGCCTG 3060 

GTAAACATCA CCATCATCAA GGAGCAAGCC AGAGACGTGG TGTCCTTTGA GCAGCCTGAG 3120 

TTCTCGGTCA GCCGCGGGGA CCAGGTGGCC CGCATCCCTG TCATCCGGCG TGTCCTGGAC 3180 

GGCGGGAAGT CCCAGGTCTC CTACCGCACA CAGGATGGCA CCGCGCAGGG CAACCGGGAC 3240 

TACATCCCCG TGGAGGGTGA GCTGCTGTTC CAGCCTGGGG AGGCCTGGAA AGAGCTGCAG 3300 

GTGAAGCTCC TGGAGCTGCA AGAAGTTGAC TCCCTCCTGC GGGGCCGCCA GGTCCGCCGT 3360 

TTCCACGTCC AGCTCAGCAA CCCTAAGTTT GGGGCCCACC TGGGCCAGCC CCACTCCACC 3420 

ACCATCATCA TCAGGGACCC AGATGAACTG GACCGGAGCT TCACGAGTCA GATGTTGTCA 3480 

TCACAGCCAC CCCCTCACGG CGACCTGGGC GCCCCGCAGA ACCCCAATGC TAAGGCCGCT 3540 

GGGTCCAGGA AGATCCATTT CAACTGGCTG CCCCCTTCTG GCAAGCCAAT GGGGTACAGG 3600 

GTAAAGTACT GGATTCAGGG TGACTCCGAA TCCGAAGCCC ACCTGCTCGA CAGCAAGGTG 3660 

CCCTCAGTGG AGCTCACCAA CCTGTACCCG TATTGCGACT ATGAGATGAA GGTGTGCGCC 3720 

TACGGGGCTC AGGGCGAGGG ACCCTACAGC TCCCTGGTGT CCTGCCGCAC CCACCAGGAA 3780 

GTGCCCAGCG AGCCAGGGCG TCTGGCCTTC AATGTCGTCT CCTCCACGGT GACCCAGCTG 3840 

AGCTGGGCTG AGCCGGCTGA GACCAACGGT GAGATCACAG CCTACGAGGT CTGCTATGGC 3900 

CTGGTCAACG ATGACAACCG ACCTATTGGG CCCATGAAGA AAGTGCTGGT TGACAACCCT 3960 

AAGAACCGGA TGCTGCTTAT TGAGAACCTT CGGGAGTCCC AGCCCTACCG CTACACGGTG 4020 

AAGGCGCGCA ACGGGGCCGG CTGGGGGCCT GAGCGGGAGG CCATCATCAA CCTGGCCACC 4080 

CAGCCCAAGA GGCCCATGTC CATCCCCATC ATCCCTGACA TCCCTATCGT GGACGCCCAG 4140 

AGCGGGGAGG ACTACGACAG CTTCCTTATG TACAGCGATG ACGTTCTACG CTCTCCATCG 4200 

GGCAGCCAGA GGCCCAGCGT CTCCGATGAC ACTGAGCACC TGGTGAATGG CCGGATGGAC 4260 

TTTGCCTTCC CGGGCAGCAC CAACTCCCTG CACAGGATGA CCACGACCAG TGCTGCTGCC 4320 

TATGGCACCC ACCTGAGCCC ACACGTGCCC CACCGCGTGC TAAGCACATC CTCCACCCTC 4380 
ACACGGGACT ACAACTCACT GACCCGCTCA GAACACTCAC ACTCGACCAC ACTGCCGAGG 4440 

GACTACTCCA CCCTCACCTC CGTCTCCTCC CACGACTCTC GCCTGACTGC TGGTGTGCCC 4500 

GACACGCCCA CCCGCCTGGT GTTCTCTGCC CTGGGGCCCA CATCTCTCAG AGTGAGCTGG 4560 

CAGGAGCCGC GGTGCGAGCG GCCGCTGCAG GGCTACAGTG TGGAGTACCA GCTGCTGAAC 4620 

GGCGGTGAGC TGCATCGGCT CAACATCCCC AACCCTGCCC AGACCTCGGT GGTGGTGGAA 4680 

GACCTCCTGC CCAACCACTC CTACGTGTTC CGCGTGCGGG CCCAGAGCCA GGAAGGCTGG 4740 

GGCCGAGAGC GTGAGGGTGT CATCACCATT GAATCCCAGG TGCACCCGCA GAGCCCACTG 4800 

TGTCCCCTGC CAGGCTCCGC CTTCACTTTG AGCACTCCCA GTGCCCCAGG CCCGCTGGTG 4860 

TTCACTGCCC TGAGCCCAGA CTCGCTGCAG CTGAGCTGGG AGCGGCCACG GAGGCCCAAT 4920 

GGGGATATCG TCGGCTACCT GGTGACCTGT GAGATGGCCC AAGGAGGAGG GCCAGCCACC 4980 

GCATTCCGGG TGGATGGAGA CAGCCCCGAG AGCCGGCTGA CCGTGCCGGG CCTCAGCGAG 5040 

AACGTGCCCT ACAAGTTCAA GGTGCAGGCC AGGACCACTG AGGGCTTCGG GCCAGAGCGC 5100 

GAGGGCATCA TCACCATAGA GTCCCAGGAT GGAGGACCCT TCCCGCAGCT GGGCAGCCGT 5160 

GCCGGGCTCT TCCAGCACCC GCTGCAAAGC GAGTACAGCA GCATCACCAC CACCCACACC 5220 

AGCGCCACCG AGCCCTTCCT AGTGGATGGG CCGACCCTGG GGGCCCAGCA CCTGGAGGCA 5280 

GGCGGCTCCC TCACCCGGCA TGTGACCCAG GAGTTTGTGA GCCGGACACT GACCACCAGC 5340 

GGAACCCTTA GCACCCACAT GGACCAACAG TTCTTCCAAA CTTGACCGCA CCCTGCCCCA 5400 

CCCCCGCCAT GTCCCACTAG GCGTCCTCCC GACTCCTCTC CCGGAGCCTC CTCAGCTACT 5460 

CCATCCTTGC ACCCCTGGGG GCCCAGCCCA CCCGCATGCA CAGAGCAGGG GCTAGGTGTC 5520 

TCCTGGGAGG CATGAAGGGG GCAAGGTCCG TCCTCTGTGG GCCCAAACCT ATTTGTAACC 5580 

AAAGAGCTGG GAGCAGCACA AGGACCCAGC CTTTGTTCTG CACTTAATAA ATGGTTTTGC 5640 
ACTG 



Seq ID NO: 130 Protein sequence: 
Protein Accession #: NP_000204 

1 11 21 31 41 51 

| | | | | | 

MAGPRPSPWA RLLLAALISV SLSGTLANRC KKAPVKSCTE CVRVDKDCAY CTDEMFRDRR 60 

CNTQAELLAA GCQRESIVVM ESSFQITEET QIDTTLRRSQ MSPQGLRVRL RPGEERHFEL 120 

EVFEPLESPV DLYILMDFSN SMSDDLDNLK KMGQNLARVL SQLTSDYTIG FGKFVDKVSV 180 

PQTDMRPEKL KEPWPNSDPP FSFKNVISLT EDVDEFRNKL QGERISGNLD APEGGPDAIL 240 

QTAVCTRDIG WRPDSTHLLV FSTESAFHYE ADGANVLAGI MSRNDERCHL DTTGTYTQYR 300 

TQDYPSVPTL VRLLAKHNII PIFAVTNYSY SYYEKLHTYF PVSSLGVLQE DSSNIVELLE 360 

EAFNRIRSNL DIRALDSPRG LRTEVTSKMF QKTRTGSFHI RRGEVGIYQV QLRALEHVDG 420 

THVCQLPEDQ KGNIHLKPSF SDGLKMDAGI ICDVCTCELQ KEVRSARCSF NGDFVCGQCV 480 

CSEGWSGQTC NCSTGSLSDI QPCLREGEDK PCSGRGECQC GHCVCYGEGR YEGQFCEYDN 540 

FQCPRTSGFL CNDRGRCSMG QCVCEPGWTG PSCDCPLSNA TCIDSNGGIC NGRGHCECGR 600 

CHCHQQSLYT DTICEINYSA IHPGLCEDLR SCVQCQAWGT GEKKGRTCEE CNFKVKMVDE 660 

LKRAEEVVVR CSFRDEDDDC TYSYTMEGDG APGPNSTVLV HKKKDCPPGS FWWLIPLLLL 720 

LLPLLALLLL LCWKYCACCK ACLALLPCCN RGHMVGFKED HYMLRENLMA SDHLDTPMLR 780 

SGNLKGRDVV RWKVTNNMQR PGFATHAASI NPTELVPYGL SLRLARLCTE NLLKPDTREC 840 

AQLRQEVEEN LNEVYRQISG VHKLQQTKFR QQPNAGKKQD HTIVDTVLMA PRSAKPALLK 900 

LTEKQVEQRA FHDLKVAPGY YTLTADQDAR GMVEFQEGVE LVDVRVPLFI RPEDDDEKQL 960 

LVEAIDVPAG TATLGRRLVN ITIIKEQARD VVSFEQPEFS VSRGDQVARI PVIRRVLDGG 1020 

KSQVSYRTQD GTAQGNRDYI PVEGELLFQP GEAWKELQVK LLELQEVDSL LRGRQVRRFH 1080 

VQLSNPKFGA HLGQPHSTTI IIRDPDELDR SFTSQMLSSQ PPPHGDLGAP QNPNAKAAGS 1140 

RKIHFNWLPP SGKPMGYRVK YWIQGDSESE AHLLDSKVPS VELTNLYPYC DYEMKVCAYG 1200 

AQGEGPYSSL VSCRTHQEVP SEPGRLAFNV VSSTVTQLSW AEPAETNGEI TAYEVCYGLV 1260 

NDDNRPIGPM KKVLVDNPKN RMLLIENLRE SQPYRYTVKA RNGAGWGPER EAIINLATQP 1320 

KRPMSIPIIP DIPIVDAQSG EDYDSFLMYS DDVLRSPSGS QRPSVSDDTE HLVNGRMDFA 1380 

FPGSTNSLHR MTTTSAAAYG THLSPHVPHR VLSTSSTLTR DYNSLTRSEH SHSTTLPRDY 1440 

STLTSVSSHD SRLTAGVPDT PTRLVFSALG PTSLRVSWQE PRCERPLQGY SVEYQLLNGG 1500 

ELHRLNIPNP AQTSVVVEDL LPNHSYVFRV RAQSQEGWGR EREGVITIES QVHPQSPLCP 1560 

LPGSAFTLST PSAPGPLVFT ALSPDSLQLS WERPRRPNGD IVGYLVTCEM AQGGGPATAF 1620 

RVDGDSPESR LTVPGLSENV PYKFKVQART TEGFGPEREG IITIESQDGG PFPQLGSRAG 1680 

LFQHPLQSEY SSITTTHTSA TEPFLVDGPT LGAQHLEAGG SLTRHVTQEF VSRTLTTSGT 1740 
LSTHMDQQFF QT 

Seq ID NO: 131 DNA sequence 
Nucleic Acid Accession #: BC004372 
Coding sequence: 132.. 2231 

1 11 21 31 41 51 

CCTCGTGCCG CGGACCCCAG CCTCTGCCAG GTTCGGTCCG CCATCCTCGT CCCGTCCTCC 60 

GCCGGCCCCT GCCCCGCGCC CAGGGATCCT CCAGCTCCTT TCGCCCGCGC CCTCCGTTCG 120 

CTCCGGACAC CATGGACAAG TTTTGGTGGC ACGCAGCCTG GGGACTCTGC CTCGTGCCGC 180 

TGAGCCTGGC GCAGATCGAT TTGAATATAA CCTGCCGCTT TGCAGGTGTA TTCCACGTGG 240 

AGAAAAATGG TCGCTACAGC ATCTCTCGGA CGGAGGCCGC TGACCTCTGC AAGGCTTTCA 300 

ATAGCACCTT GCCCACAATG GCCCAGATGG AGAAAGCTCT GAGCATCGGA TTTGAGACCT 360 

GCAGGTATGG GTTCATAGAA GGGCATGTGG TGATTCCCCG GATCCACCCC AACTCCATCT 420 

GTGCAGCAAA CAACACAGGG GTGTACATCC TCACATCCAA CACCTCCCAG TATGACACAT 480 

ATTGCTTCAA TGCTTCAGCT CCACCTGAAG AAGATTGTAC ATCAGTCACA GACCTGCCCA 540 

ATGCCTTTGA TGGACCAATT ACCATAACTA TTGTTAACCG TGATGGCACC CGCTATGTCC 600 

AGAAAGGAGA ATACAGAACG AATCGTGAAG ACATCTACCC CAGCAACCCT ACTGATGATG 660 

ACGTGAGCAG CGGCTCCTCC AGTGAAAGGA GCAGCACTTC AGGAGGTTAC ATCTTTTACA 720 

CCTTTTCTAC TGTACACCCC ATCCCAGACG AAGACAGTCC CTGGATCACC GACAGCACAG 780 

ACAGAATCCC TGCTACCAGT ACGTCTTCAA ATACCATCTC AGCAGGCTGG GAGCCAAATG 840 

AAGAAAATGA AGATGAAAGA GACAGACACC TCAGTTTTTC TGGATCAGGC ATTGATGATG 900 

ATGAAGATTT TATCTCCAGC ACCATTTCAA CCACACCACG GGCTTTTGAC CACACAAAAC 960 

AGAACCAGGA CTGGACCCAG TGGAACCCAA GCCATTCAAA TCCGGAAGTG CTACTTCAGA 1020 

CAACCACAAG GATGACTGAT GTAGACAGAA ATGGCACCAC TGCTTATGAA GGAAACTGGA 1080 

ACCCAGAAGC ACACCCTCCC CTCATTCACC ATGAGCATCA TGAGGAAGAA GAGACCCCAC 1140 

ATTCTACAAG CACAATCCAG GCAACTCCTA GTAGTACAAC GGAAGAAACA GCTACCCAGA 1200 

AGGAACAGTG GTTTGGCAAC AGATGGCATG AGGGATATCG CCAAACACCC AGAGAAGACT 1260 
CCCATTCGAC AACAGGGACA GCTGCAGCCT CAGCTCATAC CAGCCATCCA ATGCAAGGAA 1320 

GGACAACACC AAGCCCAGAG GACAGTTCCT GGACTGATTT CTTCAACCCA ATCTCACACC 1380 

CCATGGGACG AGGTCATCAA GCAGGAAGAA GGATGGATAT GGACTCCAGT CATAGTACAA 1440 

CGCTTCAGCC TACTGCAAAT CCAAACACAG GTTTGGTGGA AGATTTGGAC AGGACAGGAC 1500 

CTCTTTCAAT GACAACGCAG CAGAGTAATT CTCAGAGCTT CTCTACATCA CATGAAGGCT 1560 

TGGAAGAAGA TAAAGACCAT CCAACAACTT CTACTCTGAC ATCAAGCAAT AGGAATGATG 1620 

TCACAGGTGG AAGAAGAGAC CCAAATCATT CTGAAGGCTC AACTACTTTA CTGGAAGGTT 1680 

ATACCTCTCA TTACCCACAC ACGAAGGAAA GCAGGACCTT CATCCCAGTG ACCTCAGCTA 1740 

AGACTGGGTC CTTTGGAGTT ACTGCAGTTA CTGTTGGAGA TTCCAACTCT AATGTCAATC 1800 

GTTCCTTATC AGGAGACCAA GACACATTCC ACCCCAGTGG GGGGTCCCAT ACCACTCATG 1860 

GATCTGAATC AGATGGACAC TCACATGGGA GTCAAGAAGG TGGAGCAAAC ACAACCTCTG 1920 

GTCCTATAAG GACACCCCAA ATTCCAGAAT GGCTGATCAT CTTGGCATCC CTCTTGGCCT 1980 

TGGCTTTGAT TCTTGCAGTT TGCATTGCAG TCAACAGTCG AAGAAGGTGT GGGCAGAAGA 2040 

AAAAGCTAGT GATCAACAGT GGCAATGGAG CTGTGGAGGA CAGAAAGCCA AGTGGACTCA 2100 

ACGGAGAGGC CAGCAAGTCT CAGGAAATGG TGCATTTGGT GAACAAGGAG TCGTCAGAAA 2160 

CTCCAGACCA GTTTATGACA GCTGATGAGA CAAGGAACCT GCAGAATGTG GACATGAAGA 2220 

TTGGGGTGTA ACACCTACAC CATTATCTTG GAAAGAAACA ACCGTTGGAA ACATAACCAT 2280 

TACAGGGAGC TGGGACACTT AACAGATGCA ATGTGCTACT GATTGTTTCA TTGCGAATCT 2340 
TTTTTAGCAT AAAATTTTCT ACTCTTAAAA AAAAAAAAAA AAAAAAA 



Seq ID NO: 132 Protein sequence: 
Protein Accession #: AAH04372 

1 11 21 31 41 51 

| | | | | | 

MDKFWWHAAW GLCLVPLSLA QIDLNITCRF AGVFHVEKNG RYSISRTEAA DLCKAFNSTL 60 

PTMAQMEKAL SIGFETCRYG FIEGHVVIPR IHPNSICAAN NTGVYILTSN TSQYDTYCFN 120 

ASAPPEEDCT SVTDLPNAFD GPITITIVNR DGTRYVQKGE YRTNPEDIYP SNPTDDDVSS 180 

GSSSERSSTS GGYIFYTFST VHPIPDEDSP WITDSTDRIP ATSTSSNTIS AGWEPNEENE 240 

DERDRHLSFS GSGIDDDEDF ISSTISTTPR AFDHTKQNQD WTQWNPSHSN PEVLLQTTTR 300 

MTDVDRNGTT AYEGNWNPEA HPPLIHHEHH EEEETPHSTS TIQATPSSTT EETATQKEQW 360 

FGNRWHEGYR QTPREDSHST TGTAAASAHT SHPMQGRTTP SPEDSSWTDF FNPISHPMGR 420 

GHQAGRRMDM DSSHSTTLQP TANPNTGLVE DLDRTGPLSM TTQQSNSQSF STSHEGLEED 480 

KDHPTTSTLT SSNRNDVTGG RRDPNHSEGS TTLLEGYTSH YPHTKESRTF IPVTSAKTGS 540 

FGVTAVTVGD SNSNVNRSLS GDQDTFHPSG GSHTTHGSES DGHSHGSQEG GANTTSGPIR 600 

TPQIPEWLII LASLLALALI LAVCIAVNSR RRCGQKKKLV INSGNGAVED RKPSGLNGEA 660 
SKSQEMVHLV NKESSETPDQ FMTADETRNL QNVDMKIGV 

Seq ID NO: 133 DNA sequence 
Nucleic Acid Accession #: NM_002882 
Coding sequence: 150-755 

1 11 21 31 41 51 

| | | | | | 

CGAGGTTCGG GTCGTGGGGC GGAGGGAAGA GCGGGCGGGC GGGAGGCGCC GGCGCCAGAC 60 

GCGGAGGGAA GGAGCTACGA GTAGCCGCCG AGAGGCCGCG GAGCCAGCGA CGACCGACCC 120 

AGCCGAGCCG CCGCCGCCGC CGCGCCCCCA TGGCGGCCGC CAAGGACACT CATGAGGACC 180 

ATGATACTTC CACTGAGAAT ACAGACGAGT CCAACCATGA CCCTCAGTTT GAGCCAATAG 240 

TTTCTCTTCC TGAGCAAGAA ATTAAAACAC TGGAAGAAGA TGAAGAGGAA CTTTTTAAAA 300 

TGCGGGCAAA ACTGTTCCGA TTTGCCTCTG AGAACGATCT CCCAGAATGG AAGGAGCGAG 360 

GCACTGGTGA CGTCAAGCTC CTGAAGCACA AGGAGAAAGG GGCCATCCGC CTCCTCATGC 420 

GGAGGGACAA GACCCTGAAG ATCTGTGCCA ACCACTACAT CACGCCGATG ATGGAGCTGA 480 

AGCCCAACGC AGGTAGGGAC CGTGCCTGGG TCTGGAACAC CCACGCTGAC TTCGCCGACG 540 

AGTGCCCCAA GCCAGAGCTG CTGGCCATCC GCTTCCTGAA TGCTGAGAAT GCACAGAAAT 600 

TCAAAACAAA GTTTGAAGAA TGCAGGAAAG AGATCGAAGA GAGAGAAAAG AAAGCAGGAT 660 

CAGGCAAAAA TGATCATGCC GAAAAAGTGG CGGAAAAGCT AGAAGCTCTC TGGGTGAAGG 720 

AGGAGACCAA GGAGGATGCT GAGGAGAAGC AATAAATCGT CTTATTTTAT TTTCTTTTCC 780 

TCTCTTTCCT TTCCTTTTTT TAAAAAATTT TACCCTGCCC CTCTTTTTCG GTTTGTTTTT 840 
ATTCTTTCAT TTTTACAAGG GACGTTATAT AAAGAACTGA ACTC 

Seq ID NO: 134 Protein sequence: 
Protein Accession #: NP_002873 

1 11 21 31 41 51 

| | | | | | 

MAAAKDTHED HDTSTENTDE SNHDPQFEPI VSLPEQEIKT LEEDEEELFK MRAKLFRFAS 60 
ENDLPEWKER GTGDVKLLKH KEKGAIRLLM RRDKTLKICA NHYITPMMEL KPNAGSDRAW 120 
VWNTHADFAD ECPKPELLAI RFLNAENAQK FKTKFEECRK EIEEREKKAG SGKNDHAEKV 180 
AEKLEALSVK EETKEDAEEK Q 

Seq ID NO: 135 DNA sequence 

Nucleic Acid Accession #: NM_000077.2 

Coding sequence: 272-742 

1 11 21 31 41 51 

| | | | | | 

CCCAACCTGG GGCGACTTCA GGTGTGCCAC ATTCGCTAAG TGCTCGGAGT TAATAGCACC 60 

TCCTCCGAGC ACTCGCTCAC GGCGTCCCCT TGCCTGGAAA GATACCGCGG TCCCTCCAGA 120 

GGATTTGAGG GACAGGGTCG GAGGGGGCTC TTCCGCCAGC ACCGGAGGAA GAAAGAGGAG 180 

GGGCTGGCTG GTCACCAGAG GGTGGGGCGG ACCGCGTGCG CTCGGCGGCT GCGGAGAGGG 240 

GGAGAGCAGG CAGCGGGCGG CGGGGAGCAG CATGGAGCCG GCGGCGGGGA GCAGCATGGA 300 

GCCTTCGGCT GACTGGCTGG CCACGGCCGC GGCCCGGGGT CGGGTAGAGG AGGTGCGGGC 360 

GCTGCTGGAG GCGGGGGCGC TGCCCAACGC ACCGAATAGT TACGGTCGGA GGCCGATCCA 420 

GGTCATGATG ATGGGCAGCG CCCGAGTGGC GGAGCTGCTG CTGCTCCACG GCGCGGAGCC 480 
CAACTGCGCC GACCCCGCCA CTCTCACCCG ACCCGTGCAC GACGCTGCCC GGGAGGGCTT 540 
CCTGGACACG CTGGTGGTGC TGCACCGGGC CGGGGCGCGG CTGGACGTGC GCGATGCCTG 600 
GGGCCGTCTG CCCGTGGACC TGGCTGAGGA GCTGGGCCAT CGCGATGTCG CACGGTACCT 660 
GCGCGCGGCT GCGGGGGGCA CCAGAGGCAG TAACCATGCC CGCATAGATG CCGCGGAAGG 720 
TCCCTCAGAC ATCCCCGATT GAAAGAACCA GAGAGGCTCT GAGAAACCTC GGGAAACTTA 780 
GATCATCAGT CACCGAAGGT CCTACAGGGC CACAACTGCC CCCGCCACAA CCCACCCCGC 840 
TTTCGTAGTT TTCATTTAGA AAATAGAGCT TTTAAAAATG TCCTGCCTTT TAACGTAGAT 900 
ATATGCCTTC CCCCACTACC GTAAATGTCC ATTTATATCA TTTTTTATAT ATTCTTATAA 960 
AAATGTAAAA AAGAAAAACA CCGCTTCTGC CTTTTCACTG TGTTGGAGTT TTCTGGAGTG 1020 
AGCACTCACG CCCTAAGCGC ACATTCATGT GGGCATTTCT TGCGAGCCTC GCAGCCTCCG 1080 
GAAGCTGTCG ACTTCATGAC AAGCATTTTG TGAACTAGGG AAGCTCAGGG GGGTTACTGG 1140 
CTTCTCTTGA GTCACACTGC TAGCAAATGG CAGAACCAAA GCTCAAATAA AAATAAAATA 1200 
ATTTTCATTC ATTCACTC 

Seq ID NO: 136 Protein sequence: 
Protein Accession #: NP_000068.1 

1 11 21 31 41 51 

MEPAAGSSME PSADWLATAA ARGRVEEVRA LLEAGALPNA PNSYGRRPIQ VMMMGSARVA 60 
ELLLLHGAEP NCADPATLTR PVHDAAREGF LDTLVVLHRA GARLDVRDAW GRLPVDLAEE 120 
LGHRDVARYL RAAAGGTRGS NHARIDAAEG PSDIPD 

Seq ID NO: 137 DNA sequence 

Nucleic Acid Accession #: NM_058196. 

Coding sequence: 104-421 

1 11 21 31 41 51 

| | | | | | 

TGTGTGGGGG TCTGCTTGGC GGTGAGGGGG CTCTACACAA GCTTCCTTTC CGTCATGCCG 60 
GCCCCCACCC TGGCTCTGAC CATTCTGTTC TCTCTGGCAG GTCATGATGA TGGGCAGCGC 120 
CCGAGTGGCG GAGCTGCTGC TGCTCCACGG CGCGGAGCCC AACTGCGCCG ACCCCGCCAC 180 
TCTCACCCGA CCCGTGCACG ACGCTGCCCG GGAGGGCTTC CTGGACACGC TGGTGGTGCT 240 
GCACCGGGCC GGGGCGCGGC TGGACGTGCG CGATGCCTGG GGCCGTCTGC CCGTGGACCT 300 
GGCTGAGGAG CTGGGCCATC GCGATGTCGC ACGGTACCTG CGCGCGGCTG CGGGGGGCAC 360 
CAGAGGCAGT AACCATGCCC GCATAGATGC CGCGGAAGGT CCCTCAGACA TCCCCGATTG 420 
AAAGAACCAG AGAGGCTCTG AGAAACCTCG GGAAACTTAG ATCATCAGTC ACCGAAGGTC 480 
CTACAGGGCC ACAACTGCCC CCGCCACAAC CCACCCCGCT TTCGTAGTTT TCATTTAGAA 540 
AATAGAGCTT TTAAAAATGT CCTGCCTTTT AACGTAGATA TAAGCCTTCC CCCACTACCG 600 
TAAATGTCCA TTTATATCAT TTTTTATATA TTCTTATAAA AATGTAAAAA AGAAAAACAC 660 
CGCTTCTGCC TTTTCACTGT GTTGGAGTTT TCTGGAGTGA GCACTCACGC CCTAAGCGCA 720 
CATTCATGTG GGCATTTCTT GCGAGCCTCG CAGCCTCCGG AAGCTGTCGA CTTCATGACA 780 
AGCATTTTGT GAACTAGGGA AGCTCAGGGG GGTTACTGGC TTCTCTTGAG TCACACTGCT 840 
AGCAAATGGC AGAACCAAAG CTCAAATAAA AATAAAATAA TTTTCATTCA TTCACTC 

Seq ID NO: 138 Protein sequence: 
Protein Accession #: NP_476103.1 

1 11 21 31 41 51 

| | | | | | 

MMMGSARVAE LLLLHGAEPN CADPATLTRP VHDAAREGFL DTLVVLHRAG ARLDVRDAWG 60 
RLPVDLAEEL GHRDVARYLR AAAGGTRGSN HARIDAAEGP SDIPD 

Seq ID NO: 139 DNA sequence 

Nucleic Acid Accession #: NM_058197.1 

Coding sequence: 272-684 

1 11 21 31 41 51 

| | | | | | 

CCCAACCTGG GGCGACTTCA GGTGTGCCAC ATTCGCTAAG TGCTCGGAGT TAATAGCACC 60 
TCCTCCGAGC ACTCGCTCAC GGCGTCCCCT TGCCTGGAAA GATACCGCGG TCCCTCCAGA 120 
GGATTTGAGG GACAGGGTCG GAGGGGGCTC TTCCGCCAGC ACCGGAGGAA GAAAGAGGAG 180 
GGGCTGGCTG GTCACCAGAG GGTGGGGCGG ACCGCGTGCG CTCGGCGGCT GCGGAGAGGG 240 
GGAGAGCAGG CAGCGGGCGG CGGGGAGCAG CATGGAGCCG GCGGCGGGGA GCAGCATGGA 300 
GCCGGCGGCG GGGAGCAGCA TGGAGCCTTC GGCTGACTGG CTGGCCACGG CCGCGGCCCG 360 
GGGTCGGGTA GAGGAGGTGC GGGCGCTGCT GGAGGCGGGG GCGCTGCCCA ACGCACCGAA 420 
TAGTTACGGT CGGAGGCCGA TCCAGGTGGG TAGAAGGTCT GCAGCGGGAG CAGGGGATGG 480 
CGGGCGACTC TGGAGGACGA AGTTTGCAGG GGAATTGGAA TCAGGTAGCG CTTCGATTCT 540 
CCGGAAAAAG GGGAGGCTTC CTGGGGAGTT TTCAGAAGGG GTTTGTAATC ACAGACCTCC 600 
TCCTGGCGAC GCCCTGGGGG CTTGGGAAAC CAAGGAAGAG GAATGAGGAG CCACGCGCGT 660 
ACAGATCTCT CGAATGCTGA GAAGATCTGA AGGGGGGAAC ATATTTGTAT TAGATGGAAG 720 
TCATGATGAT GGGCAGCGCC CGAGTGGCGG AGCTGCTGCT GCTCCACGGC GCGGAGCCCA 780 
ACTGCGCCGA CCCCGCCACT CTCACCCGAC CCGTGCACGA CGCTGCCCGG GAGGGCTTCC 840 
TGGACACGCT GGTGGTGCTG CACCGGGCCG GGGCGCGGCT GGACGTGCGC GATGCCTGGG 900 
GCCGTCTGCC CGTGGACCTG GCTGAGGAGC TGGGCCATCG CGATGTCGCA CGGTACCTGC 960 
GCGCGGCTGC GGGGGGCACC AGAGGCAGTA ACCATGCCCG CATAGATGCC GCGGAAGGTC 1020 
CCTCAGACAT CCCCGATTGA AAGAACCAGA GAGGCTCTGA GAAACCTCGG GAACTTAGAT 1080 
CATCAGTCAC CGAAGGTCCT ACAGGGCCAC AACTGCCCCC GCCACAACCC ACCCCGCTTT 1140 
CGTAGTTTTC ATTTAGAAAA TAGAGCTTTT AAAAATGTCC TGCCTTTTAA CGTAGATATA 1200 
TGCCTTCCCC CACTACCGTA AATGTCCATT TATATCATTT TTTATATATT CTTATAAAAA 1260 
TGTAAAAAAG AAAAACACCG CTTCTGCCTT TTCACTGTGT TGGAGTTTTC TGGAGTGAGC 1320 
ACTCACGCCC TAAGCGCACA TTCATGTGGG CATTTCTTGC GAGCCTCGCA GCCTCCGGAA 1380 
GCTGTCGACT TCATGACAAG CATTTTGTGA ACTAGGGAAG CTCAGGGGGG TTACTGGCTT 1440 
CTCTTGAGTC ACACTGCTAG CAAATGGCAG AACCAAAGCT CAAATAAAAA TAAAATAATT 1500 
TTCATTCATT CACTC 



Seq ID NO: 140 Protein sequence: 
Protein Accession #: NP_478104.1 

1 11 21 31 41 51 

| | | | | | 

MEPAAGSSME PAAGSSMEPS ADWLATAAAR GRVEEVRALL EAGALPNAPN SYGRRPIQVG 60 
RRSAAGAGDG GRLWRTKFAG ELESGSASIL RKKGRLPGEF SEGVCNHRPP PGDALGAWET 120 



Seq ID NO: 141 DNA sequence 

Nucleic Acid Accession #: NM_058195.1 

Coding sequence: 163-684 

1 11 21 31 41 51 

| | | | | | 

CCTCCCTACG GGCGCCTCCG GCAGCCCTTC CCGCGTGCGC AGGGCTCAGA GCCGTTCCGA 60 

GATCTTGGAG GTCCGGGTGG GAGTGGGGGT GGGGTGGGGG TGGGGGTGAA GGTGGGGGGC 120 

GGGCGCGCTC AGGGAAGGCG GGTGCGCGCC TGCGGGGCGG AGATGGGCAG GGGGCGGTGC 180 

GTGGGTCCCA GTCTGCAGTT AAGGGGGCAG GAGTGGCGCT GCTCACCTCT GGTGCCAAAG 240 

GGCGGCGCAG CGGCTGCCGA GCTCGGCCCT GGAGGCGGCG AGAACATGGT GCGCAGGTTC 300 

TTGGTGACCC TCCGGATTCG GCGCGCGTGC GGCCCGCCGC GAGTGAGGGT TTTCGTGGTT 360 

CACATCCCGC GGCTCACGGG GGAGTGGGCA GCGCCAGGGG GGCCCGCCGC TGTGGCCCTC 420 

GTGCTGATGC TACTGAGGAG CCAGCGTCTA GGGCAGCAGC CGCTTCCTAG AAGACCAGGT 480 

CATGATGATG GGCAGCGCCC GAGTGGCGGA GCTGCTGCTG CTCCACGGCG CGGAGCCCAA 540 

CTGCGCCGAC CCCGCCACTC TCACCCGACC CGTGCACGAC GCTGCCCGGG AGGGCTTCCT 600 

GGACACGCTG GTGGTGCTGC ACCGGGCCGG GGCGCGGCTG GACGTGCGCG ATGCCTGGGG 660 

CCGTCTGCCC GTGGACCTGG CTGAGGAGCT GGGCCATCGC GATGTCGCAC GGTACCTGCG 720 

CGCGGCTGCG GGGGGCACCA GAGGCAGTAA CCATGCCCGC ATAGATGCCG CGGAAGGTCC 780 

CTCAGACATC CCCGATTGAA AGAACCAGAG AGGCTCTGAG AAACCTCGGG AAACTTAGAT 840 

CATCAGTCAC CGAAGGTCCT ACAGGGCCAC AACTGCCCCC GCCACAACCC ACCCCGCTTT 900 

CGTAGTTTTC ATTTAGAAAA TAGAGCTTTT AAAAATGTCC TGCCTTTTAA CGTAGATATA 960 

TGCCTTCCCC CACTACCGTA AATGTCCATT TATATCATTT TTTATATATT CTTATAAAAA 1020 

TGTAAAAAAG AAAAACACCG CTTCTGCCTT TTCACTGTGT TGGAGTTTTC TGGAGTGAGC 1080 

ACTCACGCCC TAAGCGCACA TTCATGTGGG CATTTCTTGC GAGCCTCGCA GCCTCCGGAA 1140 

GCTGTCGACT TCATGACAAG CATTTTGTGA ACTAGGGAAG CTCAGGGGGG TTACTGGCTT 1200 

CTCTTGAGTC ACACTGCTAG CAAATGGCAG AACCAAAGCT CAAATAAAAA TAAAATAATT 1260 
TTCATTCATT CACTC 

Seq ID NO: 142 Protein sequence: 
Protein Accession #: NP_478102.1 

1 11 21 31 41 51 

| | | | | | 

MGRGRCVGPS LQLRGQEWRC SPLVPKGGAA AAELGPGGGE NMVRRFLVTL RIRRACGPPR 60 
VRVFVVHIPR LTGEWAAPGA PAAVALVLML LRSQRLGQQP LPRRPGHDDG QRPSGGAAAA 120 
PRRGAQLRRP RHSHPTRARR CPGGLPGHAG GAAPGRGAAG RARCLGPSAR GPG 

Seq ID NO: 143 DNA sequence 
Nucleic Acid Accession #: NM_018131 
Coding sequence: 412.. 1107 



1 11 21 31 41 51 

| | | | | | 

GAAATTGCAC ACTTAAAGAC ATCAGTGGAT GAAATCACAA GTGGGAAAGG AAAGCTGACT 60 

GATAAAGAGA GACAGAGACT TTTGGAGAAA ATTCGAGTCC TTGAGGCTGA GAAGGAGAAG 120 

AATGCTTATC AACTCACAGA GAAGGACAAA GAAATACAGC GACTGAGAGA CCAACTGAAG 180 

GCCAGATATA GTACTACCGC ATTGCTTGAA CAGCTGGAAG AGACAACGAG AGAAGGAGAA 240 

AGGAGGGAGC AGGTGTTGAA AGCCTTATCT GAAGAGAAAG ACGTATTGAA ACAACAGTTG 300 

TCTGCTGCAA CCTCACGAAT TGCTGAACTT GAAAGCAAAA CCAATACACT CCGTTTATCA 360 

CAGACTGTGG CTCCAAACTG CTTCAACTCA TCAATAAATA ATATTCATGA AATGGAAATA 420 

CAGCTGAAAG ATGCTCTGGA GAAAAATCAG CAGTGGCTCG TGTATGATCA GCAGCGGGAA 480 

GTCTATGTAA AAGGACTTTT AGCAAAGATC TTTGAGTTGG AAAAGAAAAC GGAAACAGCT 540 

GCTCATTCAC TCCCACAGCA GACAAAAAAG CCTGAATCAG AAGGTTATCT TCAAGAAGAG 600 

AAGCAGAAAT GTTACAACGA TCTCTTGGCA AGTGCAAAAA AAGATCTTGA GGTTGAACGA 660 

CAAACCATAA CTCAGCTGAG TTTTGAACTG AGTGAATTTC GAAGAAAATA TGAAGAAACC 720 

CAAAAAGAAG TTCACAATTT AAATCAGCTG TTGTATTCAC AAAGAAGGGC AGATGTGCAA 780 

CATCTGGAAG ATGATAGGCA TAAAACAGAG AAGATACAAA AACTCAGGGA AGAGAATGAT 840 

ATTGCTAGGG GAAAACTTGA AGAAGAGAAG AAGAGATCCG AAGAGCTCTT ATCTCAGGTC 900 

CAGTCTCTTT ACACATCTCT GCTAAAGCAG CAAGAAGAAC AAACAAGGGT AGCTCTGTTG 960 

GAACAACAGA TGCAGGCATG TACTTTAGAC TTTGAAAATG AAAAACTCGA CCGTCAACAT 1020 

GTGCAGCATC AATTGCATGT AATTCTTAAG GAGCTCCGAA AAGCAAGAAA AAATAACACA 1080 

GTTGGAATCC TTGAAACAGC TTCATGAGTT TGCCATCACA GAGCCATTAG TCACTTTCCA 1140 

AGGAGAGACT GAAAACAGAG AAAAAGTTGC CGCCTCACCA AAAAGTCCCA CTGCTGCACT 1200 

CAATGGAAGC CTGGTGGAAT GTCCCAAGTG CAATATACAG TATCCAGCCA CTGAGCATCG 1260 

CGATCTGCTT GTCCATGTGG AATACTGTTC AAAGTAGCAA AATAAGTATT TGTTTTGATA 1320 

TTAAAAGATT CAATACTGTA TTTTCTGTTA GCTTGTGGGC ATTTTGAATT ATATATTTCA 1380 

CATTTTGCAT AAAACTGCCT ATCTACCTTT GACACTCCAG CATGCTAGTG AATCATGTAT 1440 

CTTTTAGGCT GCTGTGCATT TCTCTTGGCA GTGATACCTC CCTGACATGG TTCATCATCA 1500 

GGCTGCAATG ACAGAATGTG GTGAGCAGCG TCTACTGAGA TACTAACATT TTGCACTGTC 1560 

AAAATACTTG GTGAGGAAAA GATAGCTCAG GTTATTGCTA ATGGGTTAAT GCACCAGCAA 1620 

GCAAAATATT TTATGTTTCG GGGGTTTTGA AAAATCAAAG ATAATTAACC AAGGATCTTA 1680 

ACTGTGTTCG CATTTTTTAT CCAAGCACTT AGAAAACCTA CAATCCTAAT TTTGATGTCC 1740 

ATTGTTAAGA GGTGGTGATA GATACTATTT TTTTTTCATA TTGTATAGCG GTTATTAGAA 1800 
AAGTTGGGGA TTTTCTTGAT CTTTATTGCT GCTTACCATT GAAACTTAAC CCAGCTGTGT 1860 

TCCCCAACTC TGTTCTGCGC ACGAAACAGT ATCTGTTTGA GGCATAATCT TAAGTGGCCA 1920 

CACACAATGT TTTCTCTTAT GTTATCTGGC AGTAACTGTA ACTTGAATTA CATTAGCACA 1980 

TTCTGCTTAG CTAAAATTGT TAAAATAAAC TTTAATAAAC CCATGTAGCC CTCTCATTTG 2040 

ATTGACAGTA TTTTAGTTAT TTTTGGCATT CTTAAAGCTG GGCAATGTAA TGATCAGATC 2100 

TTTGTTTGTC TGAACAGGTA TTTTTATACA TGCTTTTTGT AAACCAAAAA CTTTTAAATT 2160 

TCTTCAGGTT TTCTAACATG CTTACCACTG GGCTACTGTA AATGAGAAAA GAATAAAATT 2220 
ATTTAATGTT TT 



Seq ID NO: 144 Protein sequence: 
Protein Accession #: NP_060601 

1 11 21 31 41 51 

| | | | | | 

MEIQLKDALE KNQQWLVYDQ QREVYVKGLL AKIFELEKKT ETAAHSLPQQ TKKPESEGYL 60 

QEEKQKCYND LLASAKKDLE VERQTITQLS FELSEFRRKY EETQKEVHNL NQLLYSQRRA 120 

DVQHLEDDRH KTEKIQKLRE ENDIARGKLE EEKKRSEELL SQVQSLYTSL LKQQEEQTRV 180 
ALLEQQMQAC TLDFENEKLD RQHVQHQLHV ILKELRKARK NNTVGILETA S 

Seq ID NO: 145 DNA sequence 
Nucleic Acid Accession #: NM_001168 
Coding sequence: 50.. 478 

1 11 21 31 41 51 

| | | | | | 

CCGCCAGATT TGAATCGCGG GACCCGTTGG CAGAGGTGGC GGCGGCGGCA TGGGTGCCCC 60 

GACGTTGCCC CCTGCCTGGC AGCCCTTTCT CAAGGACCAC CGCATCTCTA CATTCAAGAA 120 

CTGGCCCTTC TTGGAGGGCT GCGCCTGCAC CCCGGAGCGG ATGGCCGAGG CTGGCTTCAT 180 

CCACTGCCCC ACTGAGAACG AGCCAGACTT GGCCCAGTGT TTCTTCTGCT TCAAGGAGCT 240 

GGAAGGCTGG GAGCCAGATG ACGACCCCAT AGAGGAACAT AAAAAGCATT CGTCCGGTTG 300 

CGCTTTCCTT TCTGTCAAGA AGCAGTTTGA AGAATTAACC CTTGGTGAAT TTTTGAAACT 360 

GGACAGAGAA AGAGCCAAGA ACAAAATTGC AAAGGAAACC AACAATAAGA AGAAAGAATT 420 

TGAGGAAACT GCGAAGAAAG TGCGCCGTGC CATCGAGCAG CTGGCTGCCA TGGATTGAGG 480 

CCTCTGGCCG GAGCTGCCTG GTCCCAGAGT GGCTGCACCA CTTCCAGGGT TTATTCCCTG 540 

GTGCCACCAG CCTTCCTGTG GGCCCCTTAG CAATGTCTTA GGAAAGGAGA TCAACATTTT 600 

CAAATTAGAT GTTTCAACTG TGCTCCTGTT TTGTCTTGAA AGTGGCACCA GAGGTGCTTC 660 

TGCCTGTGCA GCGGGTGCTG CTGGTAACAG TGGCTGCTTC TCTCTCTCTC TCTCTTTTTT 720 

GGGGGCTCAT TTTTGCTGTT TTGATTCCCG GGCTTACCAG GTGAGAAGTG AGGGAGGAAG 780 

AAGGCAGTGT CCCTTTTGCT AGAGCTGACA GCTTTGTTCG CGTGGGCAGA GCCTTCCACA 840 

GTGAATGTGT CTGGACCTCA TGTTGTTGAG GCTGTCACAG TCCTGAGTGT GGACTTGGCA 900 

GGTGCCTGTT GAATCTGAGC TGCAGGTTCC TTATCTGTCA CACCTGTGCC TCCTCAGAGG 960 

ACAGTTTTTT TGTTGTTGTG TTTTTTTGTT TTTTTTTTTT GGTAGATGCA TGACTTGTGT 1020 

GTGATGAGAG AATGGAGACA GAGTCCCTGG CTCCTCTACT GTTTAACAAC ATGGCTTTCT 1080 

TATTTTGTTT GAATTGTTAA TTCACAGAAT AGCACAAACT ACAATTAAAA CTAAGCACAA 1140 

AGCCATTCTA AGTCATTGGG GAAACGGGGT GAACTTCAGG TGGATGAGGA GACAGAATAG 1200 

AGTGATAGGA AGCGTCTGGC AGATACTCCT TTTGCCACTG CTGTGTGATT AGACAGGCCC 1260 

AGTGAGCCGC GGGGCACATG CTGGCCGCTC CTCCCTCAGA AAAAGGCAGT GGCCTAAATC 1320 

CTTTTTAAAT GACTTGGCTC GATGCTGTGG GGGACTGGCT GGGCTGCTGC AGGCCGTGTG 1380 

TCTGTCAGCC CAACCTTCAC ATCTGTCACG TTCTCCACAC GGGGGAGAGA CGCAGTCCGC 1440 

CCAGGTCCCC GCTTTCTTTG GAGGCAGCAG CTCCCGCAGG GCTGAAGTCT GGCGTAAGAT 1500 

GATGGATTTG ATTCGCCCTC CTCCCTGTCA TAGAGCTGCA GGGTGGATTG TTACAGCTTC 1560 
GCTGGAAACC TCTGGAGGTC ATCTCGGCTG TTCCTGAGAA ATAAAAAGCC TGTCATTTC 

Seq ID NO: 146 Protein sequence: 
Protein Accession #: NP_001159 

1 11 21 31 41 51 

| | | | | | 

MGAPTLPPAW QPFLKDHRIS TFKNWPFLEG CACTPERMAE AGFIHCPTEN EPDLAQCFFC 60 

FKELEGWEPD DDPIEEHKKH SSGCAFLSVK KQFEELTLGE FLKLDRERAK NKIAKETNNK 120 
KKEFEETAKK VRRAIEQLAA MD 

Seq ID NO: 147 DNA sequence 

Nucleic Acid Accession #: NM_014176.1 

Coding sequence: 127-720 

1 11 21 31 41 51 

| | | | | | 

GCGCGCAGCG CTGGTACCCC GTTGGTCCGC GCGTTGCTGC GTTGTGAGGG GTGTCAGCTC 60 

AGTGCATCCC AGGCAGCTCT TAGTGTGGAG CAGTGAACTG TGTGTGGTTC CTTCTACTTG 120 

GGGATCATGC AGAGAGCTTC ACGTCTGAAG AGAGAGCTGC ACATGTTAGC CACAGAGCCA 180 

CCCCCAGGCA TCACATGTTG GCAAGATAAA GACCAAATGG ATGACCTGCG AGCTCAAATA 240 

TTAGGTGGAG CCAACACACC TTATGAGAAA GGTGTTTTTA AGCTAGAAGT TATCATTCCT 300 

GAGAGGTACC CATTTGAACC TCCTCAGATC CGATTTCTCA CTCCAATTTA TCATCCAAAC 360 

ATTGATTCTG CTGGAAGGAT TTGTCTGGAT GTTCTCAAAT TGCCACCAAA AGGTGCTTGG 420 

AGACCATCCC TCAACATCGC AACTGTGTTG ACCTCTATTC AGCTGCTCAT GTCAGAACCC 480 

AACCCTGATG ACCCGCTCAT GGCTGACATA TCCTCAGAAT TTAAATATAA TAAGCCAGCC 540 

TTCCTCAAGA ATGCCAGACA GTGGACAGAG AAGCATGCAA GACAGAAACA AAAGGCTGAT 600 

GAGGAAGAGA TGCTTGATAA TCTACCAGAG GCTGGTGACT CCAGAGTACA CAACTCAACA 660 

CAGAAAAGGA AGGCCAGTCA GCTAGTAGGC ATAGAAAAGA AATTTCATCC TGATGTTTAG 720 

GGGACTTGTC CTGGTTCATC TTAGTTAATG TGTTCTTTGC CAAGGTGATC TAAGTTGCCT 780 

ACCTTGAATT TTTTTTTAAA TATATTTGAT GACATAATTT TTGTGTAGTT TATTTATCTT 840 

GTACATATGT ATTTTGAAAT CTTTTAAACC TGAAAAATAA ATAGTCATTT AATGTTGAAA 900 
AAAAAAAAAA AAAAAAAAAA AAAAAAAA 



Seq ID NO: 148 Protein sequence: 
Protein Accession #: NP_054895.1 

1 11 21 31 41 51 

| | | | | | 

MQRASRLKRE LHMLATEPPP GITCWQDKDQ MDDLRAQILG GANTPYEKGV FKLEVIIPER 60 

YPFEPPQIRF LTPIYHPNID SAGRICLDVL KLPPKGAWRP SLNIATVLTS IQLLMSEPNP 120 

DDPLMADISS EFKYNKPAFL KNARQWTEKH ARQKQKADEE EMLDNLPEAG DSRVHNSTQK 180 
RKASQLVGIE KKFHPDV 

Seq ID NO: 149 DNA sequence 
Nucleic Acid Accession #: NM_003812 
Coding sequence: 224-2722 

1 11 21 31 41 51 

| | | | | | 

TCCTCTGCGT CCCGCCCCGG GAGTGGCTGC GAGGCTAGGC GAGCCGGGAA AGGGGGCGCC 60 

GCCCAGCCCC GAGCCCCGCG CCCCGTGCCC CGAGCCCGGA GCCCCCTGCC CGGGGCGGCA 120 

CCATGCGCGC CGAGCCGGCG TGACCGGCTC CGCCCGCGGC CGCCCCGCAG CTAGCCCGGC 180 

GCTCTCGCCG GCCACACGGA GCGGCGCCCG GGAGCTATGA GCCATGAAGC CGCCCGGCAG 240 

CAGCTCGCGG CAGCCGCCCC TGGCGGGCTG CAGCCTTGCC GGCGCTTCCT GCGGCCCCCA 300 

ACGCGGCCCC GCCGGCTCGG TGCCTGCCAG CGCCCCGGCC CGCACGCCGC CCTGCCGCCT 360 

GCTTCTCGTC CTTCTCCTGC TGCCTCCGCT CGCCGCCTCG TCCCGGCCCC GCGCCTGGGG 420 

GGCTGCTGCG CCCAGCGCTC CGCATTGGAA TGAAACTGCA GAAAAAAATT TGGGAGTCCT 480 

GGCAGATGAA GACAATACAT TGCAACAGAA TAGCAGCAGT AATATCAGTT ACAGCAATGC 540 

AATGCAGAAA GAAATCACAC TGCCTTCAAG ACTCATATAT TACATCAACC AAGACTCGGA 600 

AAGCCCTTAT CACGTTCTTG ACACAAAGGC AAGACACCAG CAAAAACATA ATAAGGCTGT 660 

CCATCTGGCC CAGGCAAGCT TCCAGATTGA AGCCTTCGGC TCCAAATTCA TTCTTGACCT 720 

CATACTGAAC AATGGTTTGT TGTCTTCTGA TTATGTGGAG ATTCACTACG AAAATGGGAA 780 

ACCACAGTAC TCTAAGGGTG GAGAGCACTG TTACTACCAT GGAAGCATCA GAGGCGTCAA 840 

AGACTCCAAG GTGGCTCTGT CAACCTGCAA TGGACTTCAT GGCATGTTTG AAGATGATAC 900 

CTTCGTGTAT ATGATAGAGC CACTAGAGCT GGTTCATGAT GAGAAAAGCA CAGGTCGACC 960 

ACATATAATC CAGAAAACCT TGGCAGGACA GTATTCTAAG CAAATGAAGA ATCTCACTAT 1020 

GGAAAGAGGT GACCAGTGGC CCTTTCTCTC TGAATTACAG TGGTTGAAAA GAAGGAAGAG 1080 

AGCAGTGAAT CCATCACGTG GTATATTTGA AGAAATGAAA TATTTGGAAC TTATGATTGT 1140 

TAATGATCAC AAAACGTATA AGAAGCATCG CTCTTCTCAT GCACATACCA ACAACTTTGC 1200 

AAAGTCCGTG GTCAACCTTG TGGATTCTAT TTACAAGGAG CAGCTCAACA CCAGGGTTGT 1260 

CCTGGTGGCT GTAGAGACCT GGACTGAGAA GGATCAGATT GACATCACCA CCAACCCTGT 1320 

GCAGATGCTC CATGAGTTCT CAAAATACCG GCAGCGCATT AAGCAGCATG CTGATGCTGT 1380 

GCACCTCATC TCGCGGGTGA CATTTCACTA TAAGAGAAGC AGTCTGAGTT ACTTTGGAGG 1440 

TGTCTGTTCT CGCACAAGAG GAGTTGGTGT GAATGAGTAT GGTCTTCCAA TGGCAGTGGC 1500 

ACAAGTATTA TCGCAGAGCC TGGCTCAAAA CCTTGGAATC CAATGGGAAC CTTCTAGCAG 1560 

AAAGCCAAAA TGTGACTGCA CAGAATCCTG GGGTGGCTGC ATCATGGAGG AAACAGGGGT 1620 

GTCCCATTCT CGAAAATTTT CAAAGTGCAG CATTTTGGAG TATAGAGACT TTTTACAGAG 1680 

AGGAGGTGGA GCCTGCCTTT TCAACAGGCC AACAAAGCTA TTTGAGCCCA CGGAATGTGG 1740 

AAATGGATAC GTGGAAGCTG GGGAGGAGTG TGATTGTGGT TTTCATGTGG AATGCTATGG 1800 

ATTATGCTGT AAGAAATGTT CCCTCTCCAA CGGGGCTCAC TGCAGCGACG GGCCCTGCTG 1860 

TAACAATACC TCATGTCTTT TTCAGCCACG AGGGTATGAA TGCCGGGATG CTGTGAACGA 1920 

GTGTGATATT ACTGAATATT GTACTGGAGA CTCTGGTCAG TGCCCACCAA ATCTTCATAA 1980 

GCAAGACGGA TATGCATGCA ATCAAAATCA GGGCCGCTGC TACAATGGCG AGTGCAAGAC 2040 

CAGAGACAAC CAGTGTCAGT ACATCTGGGG AACAAAGGCT GCAGGGTCTG ACAAGTTCTG 2100 

CTATGAAAAG CTGAATACAG AAGGCACTGA GAAGGGAAAC TGCGGGAAGG ATGGAGACCG 2160 

GTGGATTCAG TGCAGCAAAC ATGATGTGTT CTGTGGATTC TTACTCTGTA CCAATCTTAC 2220 

TCGAGCTCCA CGTATTGGTC AACTTCAGGG TGAGATCATT CCAACTTCCT TCTACCATCA 2280 

AGGCCGGGTG ATTGACTGCA GTGGTGCCCA TGTAGTTTTA GATGATGATA CGGATGTGGG 2340 

CTATGTAGAA GATGGAACGC CATGTGGCCC GTCTATGATG TGTTTAGATC GGAAGTGCCT 2400 

ACAAATTCAA GCCCTAAATA TGAGCAGCTG TCCACTCGAT TCCAAGGGTA AAGTCTGTTC 2460 

GGGCCATGGG GTGTGTAGTA ATGAAGCCAC CTGCATTTGT GATTTCACCT GGGCAGGGAC 2520 

AGATTGCAGT ATCCGGGATC CAGTTAGGAA CCTTCACCCC CCCAAGGATG AAGGACCCAA 2580 

GGGTCCTAGT GCCACCAATC TCATAATAGG CTCCATCGCT GGTGCCATCC TGGTAGCAGC 2640 

TATTGTCCTT GGGGGCACAG GCTGGGGATT TAAAAATGTC AAGAAGAGAA GGTTCGATCC 2700 

TACTCAGCAA GGCCCCATCT GAATCAGCTG CGCTGGATGG ACACCGCCTT GCACTGTTGG 2760 

ATTCTGGGTA TGACATACTC GCAGCAGTGT TACTGGAACT ATTAAGTTTG TAAACAAAAC 2820 

CTTTGGGTGG TAATGACTAC GGAGCTAAAG TTGGGGTGAC AAGGATGGGG TAAAAGAAAA 2880 

CTGTCTCTTT TGGAAATAAT GTCAAAGAAC ACCTTTCACC ACCTGTCAGT AAACGGGGGA 2940 

GGGGGCAAAA GACCATGCTA TAAAAAGAAC TGTTCCAGAA TCTTTTTTTT TCCCTAATGG 3000 
ACGAAGGAAC AACACACACA CAAAAATTAA ATGCAATAAA GGAATCATTA AAAA 



Seq ID NO: 150 Protein sequence: 
Protein Accession #: NP_003803 

1 11 21 31 41 51 

| | | | | | 

MKPPGSSSRQ PPLAGCSLAG ASCGPQRGPA GSVPASAPAR TPPCRLLLVL LLLPPLAASS 60 

RPRAWGAAAP SAPHWNETAE KNLGVLADED NTLQQNSSSN ISYSNAMQKE ITLPSRLIYY 120 

INQDSESPYH VLDTKARHQQ KHNKAVHLAQ ASFQIEAFGS KFILDLILNN GLLSSDYVEI 180 

HYENGKPQYS KGGEHCYYHG SIRGVKDSKV ALSTCNGLHG MFEDDTFVYM IEPLELVHDE 240 

KSTGRPHIIQ KTLAGQYSKQ MKNLTMERGD QWPFLSELQW LKRRKRAVNP SRGIFEEMKY 300 

LELMIVNDHK TYKKHRSSHA HTNNFAKSVV NLVDSIYKEQ LNTRVVLVAV ETWTEKDQID 360 

ITTNPVQMLH EFSKYRQRIK QHADAVHLIS RVTFHYKRSS LSYFGGVCSR TRGVGVNEYG 420 

LPMAVAQVLS QSLAQNLGIQ WEPSSRKPKC DCTESWGGCI MEETGVSHSR KFSKCSILEY 480 

RDFLQRGGGA CLFNRPTKLF EPTECGNGYV EAGEECDCGF HVECYGLCCK KCSLSNGAHC 540 

SDGPCCNNTS CLFQPRGYEC RDAVNECDIT EYCTGDSGQC PPNLHKQDGY ACNQNQGRCY 600 

NGECKTRDNQ CQYIWGTKAA GSDKFCYEKL NTEGTEKGNC GKDGDRWIQC SKHDVFCGFL 660 

LCTNLTRAPR IGQLQGEIIP TSFYHQGRVI DCSGAHVVLD DDTDVGYVED GTPCGPSMMC 720 

LDRKCLQIQA LNMSSCPLDS KGKVCSGHGV CSNEATCICD FTWAGTDCSI RDPVRNLHPP 780 
KDEGPKGPSA TNLIIGSIAG AILVAAIVLG GTGWGFKNVK KRRFDPTQQG PI 

Seq ID NO: 151 DNA sequence 
Nucleic Acid Accession #: NM_023915 
Coding sequence: 250-1326 

1 11 21 31 41 51 

| | | | | | 

GGCACGAGGG TTTCGTTTTC ATGCTTTACC AGAAAATCCA CTTCCCTGCC GACCTTAGTT 60 

TCAAAGCTTA TTCTTAATTA GAGACAAGAA ACCTGTTTCA ACTTGAAGAC ACCGTATGAG 120 

GTGAATGGAC AGCCAGCCAC CACAATGAAA GAAATCAAAC CAGGAATAAC CTATGCTGAA 180 

CCCACGCCTC AATCGTCCCC AAGTGTTTCC TGACACGCAT CTTTGCTTAC AGTGCATCAC 240 

AACTGAAGAA TGGGGTTCAA CTTGACGCTT GCAAAATTAC CAAATAACGA GCTGCACGGC 300

CAAGAGAGTC ACAATTCAGG CAACAGGAGC GACGGGCCAG GAAAGAACAC CACCCTTCAC 360 

AATGAATTTG ACACAATTGT CTTGCCGGTG CTTTATCTCA TTATATTTGT GGCAAGCATC 420 

TTGCTGAATG GTTTAGCAGT GTGGATCTTC TTCCACATTA GGAATAAAAC CAGCTTCATA 480

TTCTATCTCA AAAACATAGT GGTTGCAGAC CTCATAATGA CGCTGACATT TCCATTTCGA 540 

ATAGTCCATG ATGCAGGATT TGGACCTTGG TACTTCAAGT TTATTCTCTG CAGATACACT 600 

TCAGTTTTGT TTTATGCAAA CATGTATACT TCCATCGTGT TCCTTGGGCT GATAAGCATT 660 

GATCGCTATC TGAAGGTGGT CAAGCCATTT GGGGACTCTC GGATGTACAG CATAACCTTC 720 

ACGAAGGTTT TATCTGTTTG TGTTTGGGTG ATCATGGCTG TTTTGTCTTT GCCAAACATC 780 

ATCCTGACAA ATGGTCAGCC AACAGAGGAC AATATCCATG ACTGCTCAAA ACTTAAAAGT 840 

CCTTTGGGGG TCAAATGGCA TACGGCAGTC ACCTATGTGA ACAGCTGCTT GTTTGTGGCC 900 

GTGCTGGTGA TTCTGATCGG ATGTTACATA GCCATATCCA GGTACATCCA CAAATCCAGC 960 

AGGCAATTCA TAAGTCAGTC AAGCCGAAAG CGAAAACATA ACCAGAGCAT CAGGGTTGTT 1020 

GTGGCTGTGT TTTTTACCTG CTTTCTACCA TATCACTTGT GCAGAATTCC TTTTACTTTT 1080 

AGTCACTTAG ACAGGCTTTT AGATGAATCT GCACAAAAAA TCCTATATTA CTGCAAAGAA 1140 

ATTACACTTT TCTTGTCTGC GTGTAATGTT TGCCTGGATC CAATAATTTA CTTTTTCATG 1200 

TGTAGGTCAT TTTCAAGAAG GCTGTTCAAA AAATCAAATA TCAGAACCAG GAGTGAAAGC 1260 

ATCAGATCAC TGCAAAGTGT GAGAAGATCG GAAGTTCGCA TATATTATGA TTACACTGAT 1320 

GTGTAGGCCT TTTATTGTTT GTTGGAATCG ATATGTACAA AGTGTAAATA AATGTTTCTT 1380 
TTCATTATCC TTAAAAAAAA AA 

Seq ID NO: 152 Protein sequence: 
Protein Accession #: NP_076404 

1 11 21 31 41 51 

I I I I I I 

MGFNLTLAKL PNNELHGQES HNSGNRSDGP GKNTTLHNEF DTIVLPVLYL IIFVASILLN 60

GLAVWIFFHI RNKTSFIFYL KNIVVADLIM TLTFPFRIVH DAGFGPWYFK FILCRYTSVL 120

FYANMYTSIV FLGLISIDRY LKVVKPFGDS RMYSITFTKV LSVCVWVIMA VLSLPNIILT 180

NGQPTEDNIH DCSKLKSPLG VKWHTAVTYV NSCLFVAVLV ILIGCYIAIS RYIHKSSRQF 240

ISQSSRKRKH NQSIRVVVAV FFTCFLPYHL CRIPFTFSHL DRLLDESAQK ILYYCKEITL 300
FLSACNVCLD PIIYFFMCRS FSRRLFKKSN IRTRSESIRS LQSVRRSEVR IYYDYTDV 

Seq ID NO: 153 DNA sequence 
Nucleic Acid Accession #: D80008.1 
Coding sequence: 149-739 

1 11 21 31 41 51 

I I I I I I

GTTCGGCGCC AAAGCGCGGA GCGGAGGCCG AGGCGAGAGC CTGGCGCTGT AGGACTAGAA 60 

CGAAAGGAGT GAGGCGCCGA GAGCCCAGAT ACCATTTTGG CGTGAGAGCT GGTGGTTGGC 120 

AAGGCCGCGG GAGTGGGAAG CGTCCGCCAT GTTCTGCGAA AAAGCCATGG AACTGATCCG 180 

CGAGCTGCAT CGCGCGCCCG AAGGGCAACT GCCTGCCTTC AACGAGGATG GACTCAGACA 240 

AGTTCTGGAG GAGATGAAAG CTTTGTATGA ACAAAACCAG TCTGATGTGA ATGAAGCAAA 300 

GTCAGGTGGA CGAAGTGATT TGATACCAAC TATCAAATTT CGACACTGTT CTCTGTTAAG 360 

AAATCGACGC TGCACTGTAG CATACCTGTA TGACCGCTTG CTTCGGATCA GAGCACTCAG 420 

ATGGGAATAT GGTAGCGTCT TGCCAAATGC ATTACGATTT CACATGGCTG CTGAAGAAAT 480 

GGAGTGGTTT AATAATTATA AAAGATCTCT TGCTACTTAT ATGAGGTCAC TGGGAGGAGA 540 

TGAAGGTTTG GACATTACAC AGGATATGAA ACCACCAAAA AGCCTATATA TTGAAGTCCG 600 

GTGTCTAAAA GACTATGGAG AATTTGAAGT TGATGATGGC ACTTCAGTCC TATTAAAAAA 660 

AAATAGCCAG CACTTTTTAC CTCGATGGAA ATGTGAGCAG CTGATCAGAC AAGGAGTCCT 720 

GGAGCACATC CTGTCATGAC CATGCGCCGA GGCACTTCCA GGCTTCACTC AACTCATGGA 780 

CTCCTCTGTA CTCACTCTCT CCACCACTCC CTTCACCTCC CTCTTTGATT TTAGAAGCTA 840 

TAGACATTGT TTAAGATAAC TAAGAATACT TGGCTAAGAA GTATAATTTG CTAACTATTA 900 

AGGACTTTCT TTTTTTAATG TTGTACACTA TTCTTCCTAC TCTTTTTTGG TTTTGGTTTT 960 

GTTTTGTAGA GACTGTCTCA CTATGTTGCC CAAGCTGGTC TCAAACTCCT GGCCTCAAGC 1020 

AGTCCTCCCA CCTTAGCTTC TCAAAGTGTT GAGATCACAG GCGTGAGCCA CTGCACCCGG 1080 

CCCCTACTCC TTTTTCTAAT AAGCTGTATC TGTAATCACA GCATTCCTAC AGTTGTTACA 1140 

GTGTGTTTTT TAAATGAAAG TAAACATGGT TACATTTGAA TCTCTTAAAT AAGCAGTCAC 1200 

TTGGCTGGAC AGGAAGAAGG TAGATCCTGT GTGTCTTGTT TTCTGGTCAT GTGTATTGTA 1260 

CAAGCTAGAG AGCTGAATTT CTGAGATACA CATTTTCAAA TCACATGCAA GTGAAGATGA 1320 

TGGTCTGTAG AAATTTTCAG TATATATAAT GTTTAATGAC ATACTAATTT ATCATCTGGC 1380 

TATTTGGGAA GGAAGGACAC ACATGGATTT TGCACATTTC CACCATGGTG GCTGGTGTGG 1440 

CTTGTGGCTA TGGGGTGATC ACCAGTATCA CCACTTTGGA AGGGGACAGT GAAATTGGGG 1500 

CTAGAGAAGG AACTTTGTAC AGTTTTCCCT GAGATTCAGA TTGACTGAAA AGTCACATGA 1560 

AGAGTTGATT GTCTTTTAAT GGTATGTTTT AAACAGCTGA CATTTTAAAT TTTGATGAAA 1620 

TCCAGTTTAT TCGTTTGTTC TTTTATGCTT TGGGTGTTGC ATCCGAGAAA TCTTTTCCCA 1680 

TCCCAAGATC ACAATTTTTT TTCCTTTTTA CTTCTAGAAG TGTTATAATT TTAAGCTTTA 1740 

TACTTTGGTC TATGACCCGT TTTTTTTTTT GTTTTGTTTT GTTTTTTCGT TTGTTTCTTT 1800 

GTTTTGAGAT GGAGTCTTGT TCTGTCACCC AGGCTGGGGT GCAGTGGCGT GATCTTGGCT 1860

CACTGCAATC TCTATCCCCT GGGTTCAAGT GATTCTCTTG TCTCAGCCTC CCAAGTAGCT 1920 

GGGATTACAG GCACAGGCCG CCACGCCTGG CTAATTTTTG TATTTTTAGT AGAGACAGAG 1980

TTTTACCATG TTGGCCAGGC TGGTCTCAAA CTCCTGACCT CAAGTGACCC ACCTTGGCCT 2040

CCCAAAGTTT TGGGATTACA AGTGTGGGCC ACCGCGGCCA GCCTATGATC CATTTTGAAT 2100

GAATTTTTTA TATGGTGCAA GGTGTCAATC CACCTTCACT TTTTCTTGGG AATATAGATA 2160

TCCAGCTGTT TCACTACCAT TTTTTGAAAG GACTGCCCTT TGCTCTATCA CCTTTGCATT 2220

TTTGTTAAAA AGTAGTTGTC AATGTATATG TGGGTTTATT TCAGGACTCT GTTTTGTTCC 2280

ATTGACCTGT TTTTCTCTCC TGAATGCCAA TACCATATTT GTATGTAGTG TATGTAATTT 2340

TCTAATAATT CTTGAAACAG ATAGTATTAA TGTGTCATAT TTTTGCTGTT GTTTGTATTT 2400

TTTGTAGAGA TGGGGTTTCA CCGTGTTGGC CAGGCTGTGT TGAACTCCTG AGCTAAAGCA 2460

ATACACTTGC CTCGTCCTCC CCATGTGCTG GGATTACAGG CGTGAGCCTT GGTGCTGGCC 2520

CAGTGTACCA CATTTCTTTT TGAGATTTGT TTTGGCTATG TTAAGTCCTT TGCTTTTGAT 2580

GTGAAATTTG GGAACAGGCA GGGTGTGGTG GCTTATGCCT GTAATCCTAG AACTTTGGGA 2640

GGCCTAGATG GGTGGATCAC TTGAGCTCAG GAGTTCCAGA CCAGCCCGGG CCTATGGCAA 2700

AACTCCGTCT CTACAAAAAA TAGAAAAAAT TAGCCAGGTG TGGTGGTGCA TGCCTGTAGT 2760

CACAGTTACA CGGCAGGCTG AGGTGGGAGG ATCACTTGAA CCCCAGAGGT CAAGACTGCA 2820

GTGAGCTGAG ATCACACCAC TGTACTCCAG CCTGGGTGAC AAAGTGAGAC TCTATCTCAA 2880

AAAGAAATTA GGATCAATTT GTCAATTTCT ACAACAACAA CAACAAAAAC CCCTGTTGGG 2940

CACCTTGATT GAGATTGCAT TGAATTTATA TAAAACTGTT GGGAGAATTG ACATCTTAAT 3000

AATATTGAGT CTTCTGGCCT ATAAACAAGG TCTGTCTTCC TAGGTATTAA TGTTTTGTCT 3060

TCTATTTCTC TTAATAATCT TTTGTAGTTT TCAGTGTACA GGTCTACCAT GTCAGCATTT 3120

CATAGTTTTG ATGCTAAATG GTATTTTAAA ATTTCAAATT CTAACCACTT GTTGCTAGTA 3180

AATAGAAATA CAATTGATGT TGAACTTGTA TCCTTCAGCC TTGCTAAACT GTGAGTTCTC 3240

ATGGTGTTTT TGTAAATTAC ATCAACAGTC ATGTGTTCTA TGAATAAAGA GTTTTACTCC 3300
TTC

Seq ID NO: 154 Protein sequence:
Protein Accession #: BAA11503.1

1 11 21 31 41 51

I I I I I I

MFCEKAMELI RELHRAPEGQ LPAFNEDGLR QVLEEMKALY EQNQSDVNEA KSGGRSDLIP 60

TIKFRHCSLL RNRRCTVAYL YDRLLRIRAL RWEYGSVLPN ALRFHMAAEE MEWFNNYKRS 120

LATYMRSLGG DEGLDITQDM KPPKSLYIEV RCLKDYGEFE VDDGTSVLLK KNSQHFLPRW 180
KCEQLIRQGV LEHILS

Seq ID NO: 155 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 149-709

1 11 21 31 41 51

I I I I I I

GTTCGGCGCC AAAGCGCGGA GCGGAGGCCG AGGCGAGAGC CTGGCGCTGT AGGACTAGAA 60

CGAAAGGAGT GAGGCGCCGA GAGCCCAGAT ACCATTTTGG CGTGAGAGCT GGTGGTTGGC 120

AAGGCCGCGG GAGTGGGAAG CGTCCGCCAT GTTCTGCGAA AAAGCCATGG AACTGATCCG 180

CGAGCTGCAT CGCGCGCCCG AAGGGCAACT GCCTGCCTTC AACGAGGATG GACTCAGACA 240

AGTTCTGGAG GAGATGAAAG CTTTGTATGA ACAAAACCAG TCTGATGTGA ATGAAGCAAA 300

GTCAGGTGGA CGAAGTGATT TGATACCAAC TATCAAATTT CGACACTGTT CTCTGTTAAG 360

AAATCGACGC TGCACTGTAG CATACCTGTA TGACCGCTTG CTTCGGATCA GAGCACTCAG 420

ATGGGAATAT GGTAGCGTCT TGCCAAATGC ATTACGATTT CACATGGCTG CTGAAGAAAT 480

GGAGTGGTTT AATAATTATA AAAGATCTCT TGCTACTTAT ATGAGGTCAC TGGGAGGAGA 540

TGAAGGTTTG GACATTACAC AGGATATGAA ACCACCAAAA AGCCTATATA TTGAAGCTGG 600

ATGCAGTGGC GCGATCTCGG CTCAACCTGC AACCTCCACC TCCCAGGTTC ACCTCAACTG 660

CAACCTCCAC CTCCCAGGTC CGGTGTCTAA AAGACTATGG AGAATTTGAA GTTGATGATG 720

GCACTTCAGT CCTATTAAAA AAAAATAGCC AGCACTTTTT ACCTCGATGG AAATGTGAGC 780

AGCTGATCAG ACAAGGAGTC CTGGAGCACA TCCTGTCATG ACCATGCGCC GAGGCACTTC 840

CAGGCTTCAC TCAACTCATG GACTCCTCTG TACTCACTCT CTCCACCACT CCCTTCACCT 900

CCCTCTTTGA TTTTAGAAGC TATAGACATT GTTTAAGATA ACTAAGAATA CTTGGCTAAG 960

AAGTATAATT TGCTAACTAT TAAGGACTTT CTTTTTTTAA TGTTGTACAC TATTCTTCCT 1020

ACTCTTTTTT GGTTTTGGTT TTGTTTTGTA GAGACTGTCT CACTATGTTG CCCAAGCTGG 1080

TCTCAAACTC CTGGCCTCAA GCAGTCCTCC CACCTTAGCT TCTCAAAGTG TTGAGATCAC 1140

AGGCGTGAGC CACTGCACCC GGCCCCTACT CCTTTTTCTA ATAAGCTGTA TCTGTAATCA 1200

CAGCATTCCT ACAGTTGTTA CAGTGTGTTT TTTAAATGAA AGTAAACATG GTTACATTTG 1260

AATCTCTTAA ATAAGCAGTC ACTTGGCTGG ACAGGAAGAA GGTAGATCCT GTGTGTCTTG 1320

TTTTCTGGTC ATGTGTATTG TACAAGCTAG AGAGCTGAAT TTCTGAGATA CACATTTTCA 1380

AATCACATGC AAGTGAAGAT GATGGTCTGT AGAAATTTTC AGTATATATA ATGTTTAATG 1440

ACATACTAAT TTATCATCTG GCTATTTGGG AAGGAAGGAC ACACATGGAT TTTGCACATT 1500

TCCACCATGG TGGCTGGTGT GGCTTGTGGC TATGGGGTGA TCACCAGTAT CACCACTTTG 1560

GAAGGGGACA GTGAAATTGG GGCTAGAGAA GGAACTTTGT ACAGTTTTCC CTGAGATTCA 1620

GATTGACTGA AAAGTCACAT GAAGAGTTGA TTGTCTTTTA ATGGTATGTT TTAAACAGCT 1680

GACATTTTAA ATTTTGATGA AATCCAGTTT ATTCGTTTGT TCTTTTATGC TTTGGGTGTT 1740

GCATCCGAGA AATCTTTTCC CATCCCAAGA TCACAATTTT TTTTCCTTTT TACTTCTAGA 1800

AGTGTTATAA TTTTAAGCTT TATACTTTGG TCTATGACCC GTTTTTTTTT TTGTTTTGTT 1860

TTGTTTTTTC GTTTGTTTCT TTGTTTTGAG ATGGAGTCTT GTTCTGTCAC CCAGGCTGGG 1920

GTGCAGTGGC GTGATCTTGG CTCACTGCAA TCTCTATCCC CTGGGTTCAA GTGATTCTCT 1980

TGTCTCAGCC TCCCAAGTAG CTGGGATTAC AGGCACAGGC CGCCACGCCT GGCTAATTTT 2040

TGTATTTTTA GTAGAGACAG AGTTTTACCA TGTTGGCCAG GCTGGTTTCA AACTCCTGAC 2100

CTCAAGTGAC CCACCTTGGC CTCCCAAAGT TTTGGGATTA CAAGTGTGGG CCACCGCGGC 2160

CAGCCTATGA TCCATTTTGA ATGAATTTTT TATATGGTGC AAGGTGTCAA TCCACCTTCA 2220

CTTTTTCTTG GGAATATAGA TATCCAGCTG TTTCACTACC ATTTTTTGAA AGGACTGCCC 2280

TTTGCTCTAT CACCTTTGCA TTTTTGTTAA AAAGTAGTTG TCAATGTATA TGTGGGTTTA 2340

TTTCAGGACT CTGTTTTGTT CCATTGACCT GTTTTTCTCT CCTGAATGCC AATACCATAT 2400

TTGTATGTAG TGTATGTAAT TTTCTAATAA TTCTTGAAAC AGATAGTATT AATGTGTCAT 2460

ATTTTTGCTG TTGTTTGTAT TTTTTGTAGA GATGGGGTTT CACCGTGTTG GCCAGGCTGT 2520

GTTGAACTCC TGAGCTAAAG CAATACACTT GCCTCGTCCT CCCCATGTGC TGGGATTACA 2580

GGCGTGAGCC TTGGTGCTGG CCCAGTGTAC CACATTTCTT TTTGAGATTT GTTTTGGCTA 2640

TGTTAAGTCC TTTGCTTTTG ATGTGAAATT TGGGAACAGG CAGGGTGTGG TGGCTTATGC 2700

CTGTAATCCT AGAACTTTGG GAGGCCTAGA TGGGTGGATC ACTTGAGCTC AGGAGTTCCA 2760

GACCAGCCCG GGCCTATGGC AAAACTCCGT CTCTACAAAA AATAGAAAAA ATTAGCCAGG 2820




TGTGGTGGTG CATGCCTGTA GTCACAGTTA CACGGCAGGC TGAGGTGGGA GGATCACTTG 2880 

AACCCCAGAG GTCAAGACTG CAGTGAGCTG AGATCACACC ACTGTACTCC AGCCTGGGTG 2940 

ACAAAGTGAG ACTCTATCTC AAAAAGAAAT TAGGATCAAT TTGTCAATTT CTACAACAAC 3000 

AACAACAAAA ACCCCTGTTG GGCACCTTGA TTGAGATTGC ATTGAATTTA TATAAAACTG 3060 

TTGGGAGAAT TGACATCTTA ATAATATTGA GTCTTCTGGC CTATAAACAA GGTCTGTCTT 3120 

CCTAGGTATT AATGTTTTGT CTTCTATTTC TCTTAATAAT CTTTTGTAGT TTTCAGTGTA 3180 

CAGGTCTACC ATGTCAGCAT TTCATAGTTT TGATGCTAAA TGGTATTTTA AAATTTCAAA 3240 

TTCTAACCAC TTGTTGCTAG TAAATAGAAA TACAATTGAT GTTGAACTTG TATCCTTCAG 3300 

CCTTGCTAAA CTGTGAGTTC TCATGGTGTT TTTGTAAATT ACATCAACAG TCATGTGTTC 3360 
TATGAATAAA GAGTTTTACT CCTTC 

Seq ID NO: 156 Protein sequence: 
Protein Accession #: Eos sequence

1 11 21 31 41 51 

I I I I I I 

MFCEKAMELI RELHRAPEGQ LPAFNEDGLR QVLEEMKALY EQNQSDVNEA KSGGRSDLIP 60 

TIKFRHCSLL RNRRCTVAYL YDRLLRIRAL RWEYGSVLPN ALRFHMAAEE MEWFNNYKRS 120 

LATYMRSLGG DEGLDITQDM KPPKSLYIEA GCSGAISAQP ATSTSQVHLN CNLHLPGPVS 180 
KRLWRI 

Seq ID NO: 157 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 148-621 

1 11 21 31 41 51 

I I I I I I 

TTCGGCGCCA AAGCGCGGAG CGGAGGCCGA GGCGAGAGCC TGGCGCTGTA GGACTAGAAC 60

GAAAGGAGTG AGGCGCCGAG AGCCCAGATA CCATTTTGGC GTGAGAGCTG GTGGTTGGCA 120 

AGGCCGCGGG AGTGGGAAGC GTCCGCCATG TTCTGCGAAA AAGCCATGGA ACTGATCCGC 180 

GAGCTGCATC GCGCGCCCGA AGGGCAACTG CCTGCCTTCA ACGAGGATGG ACTCAGACAA 240 

GTTCTGGAGG AGATGAAAGC TTTGTATGAA CAAAACCAGT CTGATGTGAA TGAAGCAAAG 300 

TCAGGTGGAC GAAGTGATTT GATACCAACT ATCAAATTTC GACACTGTTC TCTGTTAAGA 360 

AATCGACGCT GCACTGTAGC ATACCTGTAT GACCGCTTGC TTCGGATCAG AGCACTCAGA 420 

TGGGAATATG GTAGCGTCTT GCCAAATGCA TTACGATTTC ACATGGCTGC TGAAGAAGTC 480 

CGGTGTCTAA AAGACTATGG AGAATTTGAA GTTGATGATG GCACTTCAGT CCTATTAAAA 540 

AAAAATAGCC AGCACTTTTT ACCTCGATGG AAATGTGAGC AGCTGATCAG ACAAGGAGTC 600 

CTGGAGCACA TCCTGTCATG ACCATGCGCC GAGGCACTTC CAGGCTTCAC TCAACTCATG 660 

GACTCCTCTG TACTCACTCT CTCCACCACT CCCTTCACCT CCCTCTTTGA TTTTAGAAGC 720 

TATAGACATT GTTTAAGATA ACTAAGAATA CTTGGCTAAG AAGTATAATT TGCTAACTAT 780 

TAAGGACTTT CTTTTTTTAA TGTTGTACAC TATTCTTCCT ACTCTTTTTT GGTTTTGGTT 840 

TTGTTTTGTA GAGACTGTCT CACTATGTTG CCCAAGCTGG TCTCAAACTC CTGGCCTCAA 900 

GCAGTCCTCC CACCTTAGCT TCTCAAAGTG TTGAGATCAC AGGCGTGAGC CACTGCACCC 960 

GGCCCCTACT CCTTTTTCTA ATAAGCTGTA TCTGTAATCA CAGCATTCCT ACAGTTGTTA 1020 

CAGTGTGTTT TTTAAATGAA AGTAAACATG GTTACATTTG AATCTCTTAA ATAAGCAGTC 1080 

ACTTGGCTGG ACAGGAAGAA GGTAGATCCT GTGTGTCTTG TTTTCTGGTC ATGTGTATTG 1140 

TACAAGCTAG AGAGCTGAAT TTCTGAGATA CACATTTTCA AATCACATGC AAGTGAAGAT 1200 

GATGGTCTGT AGAAATTTTC AGTATATATA ATGTTTAATG ACATACTAAT TTATCATCTG 1260 

GCTATTTGGG AAGGAAGGAC ACACATGGAT TTTGCACATT TCCACCATGG TGGCTGGTGT 1320 

GGCTTGTGGC TATGGGGTGA TCACCAGTAT CACCACTTTG GAAGGGGACA GTGAAATTGG 1380 

GGCTAGAGAA GGAACTTTGT ACAGTTTTCC CTGAGATTCA GATTGACTGA AAAGTCACAT 1440 

GAAGAGTTGA TTGTCTTTTA ATGGTATGTT TTAAACAGCT GACATTTTAA ATTTTGATGA 1500 

AATCCAGTTT ATTCGTTTGT TCTTTTATGC TTTGGGTGTT GCATCCGAGA AATCTTTTCC 1560 

CATCCCAAGA TCACAATTTT TTTTCCTTTT TACTTCTAGA AGTGTTATAA TTTTAAGCTT 1620 

TATACTTTGG TCTATGACCC GTTTTTTTTT TTGTTTTGTT TTGTTTTTTC GTTTGTTTCT 1680 

TTGTTTTGAG ATGGAGTCTT GTTCTGTCAC CCAGGCTGGG GTGCAGTGGC GTGATCTTGG 1740 

CTCACTGCAA TCTCTATCCC CTGGGTTCAA GTGATTCTCT TGTCTCAGCC TCCCAAGTAG 1800 

CTGGGATTAC AGGCACAGGC CGCCACGCCT GGCTAATTTT TGTATTTTTA GTAGAGACAG 1860

AGTTTTACCA TGTTGGCCAG GCTGGTTTCA AACTCCTGAC CTCAAGTGAC CCACCTTGGC 1920 

CTCCCAAAGT TTTGGGATTA CAAGTGTGGG CCACCGCGGC CAGCCTATGA TCCATTTTGA 1980 

ATGAATTTTT TATATGGTGC AAGGTGTCAA TCCACCTTCA CTTTTTCTTG GGAATATAGA 2040 

TATCCAGCTG TTTCACTACC ATTTTTTGAA AGGACTGCCC TTTGCTCTAT CACCTTTGCA 2100 

TTTTTGTTAA AAAGTAGTTG TCAATGTATA TGTGGGTTTA TTTCAGGACT CTGTTTTGTT 2160 

CCATTGACCT GTTTTTCTCT CCTGAATGCC AATACCATAT TTGTATGTAG TGTATGTAAT 2220 

TTTCTAATAA TTCTTGAAAC AGATAGTATT AATGTGTCAT ATTTTTGCTG TTGTTTGTAT 2280 

TTTTTGTAGA GATGGGGTTT CACCGTGTTG GCCAGGCTGT GTTGAACTCC TGAGCTAAAG 2340 

CAATACACTT GCCTCGTCCT CCCCATGTGC TGGGATTACA GGCGTGAGCC TTGGTGCTGG 2400 

CCCAGTGTAC CACATTTCTT TTTGAGATTT GTTTTGGCTA TGTTAAGTCC TTTGCTTTTG 2460 

ATGTGAAATT TGGGAACAGG CAGGGTGTGG TGGCTTATGC CTGTAATCCT AGAACTTTGG 2520 

GAGGCCTAGA TGGGTGGATC ACTTGAGCTC AGGAGTTCCA GACCAGCCCG GGCCTATGGC 2580 

AAAACTCCGT CTCTACAAAA AATAGAAAAA ATTAGCCAGG TGTGGTGGTG CATGCCTGTA 2640 

GTCACAGTTA CACGGCAGGC TGAGGTGGGA GGATCACTTG AACCCCAGAG GTCAAGACTG 2700 

CAGTGAGCTG AGATCACACC ACTGTACTCC AGCCTGGGTG ACAAAGTGAG ACTCTATCTC 2760 

AAAAAGAAAT TAGGATCAAT TTGTCAATTT CTACAACAAC AACAACAAAA ACCCCTGTTG 2820 

GGCACCTTGA TTGAGATTGC ATTGAATTTA TATAAAACTG TTGGGAGAAT TGACATCTTA 2880 

ATAATATTGA GTCTTCTGGC CTATAAACAA GGTCTGTCTT CCTAGGTATT AATGTTTTGT 2940 

CTTCTATTTC TCTTAATAAT CTTTTGTAGT TTTCAGTGTA CAGGTCTACC ATGTCAGCAT 3000 

TTCATAGTTT TGATGCTAAA TGGTATTTTA AAATTTCAAA TTCTAACCAC TTGTTGCTAG 3060 

TAAATAGAAA TACAATTGAT GTTGAACTTG TATCCTTCAG CCTTGCTAAA CTGTGAGTTC 3120 

TCATGGTGTT TTTGTAAATT ACATCAACAG TCATGTGTTC TATGAATAAA GAGTTTTACT 3180 
CCTTC 



Seq ID NO: 158 Protein sequence: 
Protein Accession #: Eos sequence



1 11 21 31 41 51 

I I I I I I 




MFCEKAMELI RELHRAPEGQ LPAFNEDGLR QVLEEMKALY EQNQSDVNEA KSGGRSDLIP 60 

TIKFRHCSLL RNRRCTVAYL YDRLLRIRAL RWEYGSVLPN ALRFHMAAEE VRCLKDYGEF 120
EVDDGTSVLL KKNSQHFLPR WKCEQLIRQG VLEHILS

Seq ID NO: 159 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 149-229 

1 11 21 31 41 51 

GTTCGGCGCC AAAGCGCGGA GCGGAGGCCG AGGCGAGAGC CTGGCGCTGT AGGACTAGAA 60

CGAAAGGAGT GAGGCGCCGA GAGCCCAGAT ACCATTTTGG CGTGAGAGCT GGTGGTTGGC 120

AAGGCCGCGG GAGTGGGAAG CGTCCGCCAT GTTCTGCGAA AAAGCCATGG AACTGATCCG 180

CGAGCTGCAT CGCGCGCCCG AAGGGCAACT GCCTGCCTTC AACAATTAGC TGGGTGTGGT 240 

GGCACACACC TGTAGTCCCA GCAACTTAGG AGGCTGAAGT GAGAGGATTG CATGGCTCCA 300 

GGAAGTTGAA ACTGCAGTGA ACTGTGGTCA CGCTATTACA CTCCAGCCTG GGTGACAGAC 360 

TGAATCCCTG TCTCAAAAAG GAAAAGGAGG ATGGACTCAG ACAAGTTCTG GAGGAGATGA 420 

AAGCTTTGTA TGAACAAAAC CAGTCTGATG TGTTCTCTGT TAAGAAATCG ACGCTGCACT 480 
GTAGCATACC TGTATGACCG CTTGCTTCGG ATCAGAGCAC TCAGATGG 

Seq ID NO: 160 Protein sequence: 
Protein Accession #: Eos sequence

1 11 21 31 41 51 

I I I I I I 

ATGTTCTGCG AAAAAGCCAT GGAACTGATC CGCGAGCTGC ATCGCGCGCC CGAAGGGCAA 60 
CTGCCTGCCT TCAACAATTA G 

Seq ID NO: 161 DNA sequence 
Nucleic Acid Accession #: U10694
Coding sequence: 1333-2280 

1 11 21 31 41 51 

I I I I I I 

GGATCCGGCC GGATCTCAGG GAGGTGAGGA CTTTGTTCTC AGAGGGTGTG TGTGGACAAA 60 

ACAGGGAGGC CCTGTGTTCG ACAGACACAG TGGTCCCAGG ATTGGAGAGC AGTCCAGGTG 120 

AGGAACCTAA GGGAGGATCG AGGGTACCTC CAGGCCAGAG AAACTCTCAG ATCAAGAGAG 180 

TTTGCCCTGC CCCTACTGTC ACCCCAGAGA GCCCGGGCAG GGCTGTCTGC TGAGGTCCCT 240 

CCTTTATCCT GGGATCACTG GTGTCGGGGA GGGCTGGCCT TGGTCTGAGG GGGCTGCACT 300 

CACGTCAGCA GAGGGAGGGT CCCAGGCCCT GCCAGGAGTC CAGGTGCAGA CTGAGGGGAC 360 

CCCACTCACC AAACACAGAG GACCTAGCCC CACCCTGCCC CTTGTGTCAG CTGAGGGAAG 420 

CCGCTGGGTG GATGGACTCC CCTCACTTCC TCTTCAGGTG TCTCCTGGAG ATAGGGCCTC 480 

AGGTCAACAG AGGGAGGGTT CCAGACCCTG CAGGCATCAA GATGAGGACC AGGCAGTATC 540 

CTCACCCCAG GACACATGGA CCCCATTGAA TTTAGACATC TCTTACTGTA CTTCCGAGGA 600 

AACCCTGGGC AGGTGTGGGC AGATGTTGGT TGGGGCATGT CCTTCTGTTC CATATCAGGG 660 

ATGTGAGCTC CTGATCTGAG AGACTCTCAG GCAAGTAGAG GAGTAGAGTC CAGTCCCTGC 720 

CAGGAGAAAG GTCAGGGCCC TGAGTGAGCG CAGAGGGGAC CATCCACCCC AAAAGTGTGT 780 

AGAACTCAAG AGTGTCCAGC CCGCCCTCTT GACAGCACTG AGGGACCGGG GCTCTGCCTG 840 

CAGTCTGCAG CCTAAGGGCC CCTCGATTCC TCTTCCAGGA GCTCCAGGAA GCAGGCAGGC 900 

CTTGGTCTGA GACAGTGTCC TCAGGTCGCA GAGCAGAGGA GACCCAGGCA GTGTCAGCAG 960 

TGAAGGTGAA GTGTTCACCC TGAATGTGCA CCAAGGGCCC CACCTGCCCC AGCACACATG 1020 

GGACCCCATA GCACCTGGCC CCATTCCCCC TACTGTCACT CATAGAGCCT TGATCTCTGC 1080 

AGGCTAGCTG CACGCTGAGT AGCCCTCTCA CTTCCTCCCT CAGGTTCTCG GGACAGGCTA 1140 

ACCAGGAGGA CAGGAGCCCC AAGAGGCCCC AGAGCAGCAC TGACGAAGAC GTGTAAGTCA 1200 

GCCTTTGTTA GAACCTCCAA GGTTCGGTTC TCAGCTGAAG TCTCTCACAC ACTCCCTCTC 1260 

TCCCCAGGCC TGTGGGTCTC CATCGCCCAG CTCCTGCCCA CGCTCCTGAC TGCTGCCCTG 1320 

ACCAGAGTCA TCATGTCTCT CGAGCAGAGG AGTCCGCACT GCAAGCCTGA TGAAGACCTT 1380 

GAAGCCCAAG GAGAGGACTT GGGCCTGATG GGTGCACAGG AACCCACAGG CGAGGAGGAG 1440 

GAGACTACCT CCTCCTCTGA CAGCAAGGAG GAGGAGGTGT CTGCTGCTGG GTCATCAAGT 1500 

CCTCCCCAGA GTCCTCAGGG AGGCGCTTCC TCCTCCATTT CCGTCTACTA CACTTTATGG 1560 

AGCCAATTCG ATGAGGGCTC CAGCAGTCAA GAAGAGGAAG AGCCAAGCTC CTCGGTCGAC 1620 

CCAGCTCAGC TGGAGTTCAT GTTCCAAGAA GCACTGAAAT TGAAGGTGGC TGAGTTGGTT 1680 

CATTTCCTGC TCCACAAATA TCGAGTCAAG GAGCCGGTCA CAAAGGCAGA AATGCTGGAG 1740 

AGCGTCATCA AAAATTACAA GCGCTACTTT CCTGTGATCT TCGGCAAAGC CTCCGAGTTC 1800 

ATGCAGGTGA TCTTTGGCAC TGATGTGAAG GAGGTGGACC CCGCCGGCCA CTCCTACATC 1860 

CTTGTCACTG CTCTTGGCCT CTCGTGCGAT AGCATGCTGG GTGATGGTCA TAGCATGCCC 1920 

AAGGCCGCCC TCCTGATCAT TGTCCTGGGT GTGATCCTAA CCAAAGACAA CTGCGCCCCT 1980 

GAAGAGGTTA TCTGGGAAGC GTTGAGTGTG ATGGGGGTGT ATGTTGGGAA GGAGCACATG 2040 

TTCTACGGGG AGCCCAGGAA GCTGCTCACC CAAGATTGGG TGCAGGAAAA CTACCTGGAG 2100 

TACCGGCAGG TGCCCGGCAG TGATCCTGCG CACTACGAGT TCCTGTGGGG TTCCAAGGCC 2160 

CACGCTGAAA CCAGCTATGA GAAGGTCATA AATTATTTGG TCATGCTCAA TGCAAGAGAG 2220 

CCCATCTGCT ACCCATCCCT TTATGAAGAG GTTTTGGGAG AGGAGCAAGA GGGAGTCTGA 2280 

GCACCAGCCG CAGCCGGGGC CAAAGTTTGT GGGGTCAGGG CCCCATCCAG CAGCTGCCCT 2340

GCCCCATGTG ACATGAGGCC CATTCTTCGC TCTGTGTTTG AAGAGAGCAA TCAGTGTTCT 2400 

CAGTGGCAGT GGGTGGAAGT GAGCACACTG TATGTCATCT CTGGGTTCCT TGTCTATTGG 2460 

GTGATTTGGA GATTTATCCT TGCTCCCTTT TGGAATTGTT CAAATGTTCT TTTAATGGTC 2520 

AGTTTAATGA ACTTCACCAT CGAAGTTAAT GAATGACAGT AGTCACACAT ATTGCTGTTT 2580 

ATGTTATTTA GGAGTAAGAT TCTTGCTTTT GAGTCACATG GGGAAATCCC TGTTATTTTG 2640 

TGAATTGGGA CAAGATAACA TAGCAGAGGA ATTAATAATT TTTTTGAAAC TTGAACTTAG 2700 

CAGCAAAATA GAGCTCATAA AGAAATAGTG AAATGAAAAT GTAGTTAATT CTTGCCTTAT 2760 

ACCTCTTTCT CTCTCCTGTA AAATTAAAAC ATATACATGT ATACCTGGAT TTGCTTGGCT 2820 

TCTTTGAGCA TGTAAGAGAA ATAAAAATTG AAAGAATAAT TTTTCCTGTT CACTGGCTCA 2880 
TTTTTTCTTC AGACACGCAC TGAACATCTG TTATTCGGAA CACCCTGGGT T 



Seq ID NO: 162 Protein sequence: 
Protein Accession #: AAA68877.1




1 11 21 31 41 51 

I I I I I I 

MSLEQRSPHC KPDEDLEAQG EDLGLMGAQE PTGEEEETTS SSDSKEEEVS AAGSSSPPQS 60 

PQGGASSSIS VYYTLWSQFD EGSSSQEEEE PSSSVDPAQL EFMFQEALKL KVAELVHFLL 120 

HKYRVKEPVT KAEMLESVIK NYKRYFPVIF GKASEFMQVI FGTDVKEVDP AGHSYILVTA 180 

LGLSCDSMLG DGHSMPKAAL LIIVLGVILT KDNCAPEEVI WEALSVMGVY VGKEHMFYGE 240 

PRKLLTQDWV QENYLEYRQV PGSDPAHYEF LWGSKAHAET SYEKVINYLV MLNAREPICY 300 
PSLYEEVLGE EQEGV 

Seq ID NO: 163 DNA sequence
Nucleic Acid Accession #: AF292100
Coding sequence: 30-809

1 11 21 31 41 51 

I I I I I I 

GGGGGGGGAG AGGCCTGGAG GACACCAACA TGAACAAGTT GAAATCATCG CAGAAGGATA 60 

AAGTTCGTCA GTTTATGATC TTCACACAAT CTAGTGAAAA AACAGCAGTA AGTTGTCTTT 120 

CTCAAAATGA CTGGAAGTTA GATGTTGCAA CAGATAATTT TTTCCAAAAT CCTGAACTTT 180 

ATATACGAGA GAGTGTAAAA GGATCATTGG ACAGGAAGAA GTTAGAACAG CTGTACAATA 240 

GATACAAAGA CCCTCAAGAT GAGAATAAAA TTGGAATAGA TGGCATACAG CAGTTCTGTG 300 

ATGACCTGGC ACTCGATCCA GCCAGCATTA GTGTGTTGAT TATTGCGTGG AAGTTCAGAG 360 

CAGCAACACA GTGCGAGTTC TCCAAACAGG AGTTCATGGA TGGCATGACA GAATTAGGAT 420 

GTGACAGCAT AGAACAACTA AAGGCCCAGA TACCCAAGAT GGAACAAGAA TTGAAAGAAC 480 

CAGGACGATT TAAGGATTTT TACCAGTTTA CTTTTAATTT TGCAAAGAAT CCAGGACAAA 540

AAGGATTAGA TCTAGAAATG GCCATTGCCT ACTGGAACTT AGTGCTTAAT GGAAGATTTA 600 

AATTCTTAGA CTTATGGAAT AAATTTTTGT TGGAACATCA TAAACGATCA ATACCAAAAG 660 

ACACTTGGAA TCTTCTTTTA GACTTCAGTA CGATGATTGC AGATGACATG TCTAATTATG 720 

ATGAAGAAGG AGCATGGCCT GTTCTTATTG ATGACTTTGT GGAATTTGCA CGCCCTCAAA 780 

TTGCTGGGAC AAAAAGTACA ACAGTGTAGC ACTAAAGGAA CCTTTTAGAA TGTACATAGT 840 

CTGTACAATA AATACAACAG AAAATTGCAC AGTCAATTTC TGCTGGCTGG ACTGAACTGA 900 

AGATCAATCC TCACAATTCA GACTGAGGGT TGAGACAAAA CTTTAAGGAT ACATCTTGGA 960 

CCATATCGTA TTTCATTCTT CTAATGGTGG TTTGGGCTTG TCTTCTAGTC TGGGCCGCTC 1020 

TAAACATTTA TAATTCCAAC ATTGTGGATT TCATCTTATA TCTGTGGACC ATCCTAGTTT 1080 

ATTCTCCCAT AAGTCTTAGA AGCTTTATGG TGATTATTTT GAGGTTTTCA TTCTCGCATA 1140 

AAGCACAATG CTGTCTTCAT CAGAAAACAG TTGGCATAAG AATTAAACAT ATGAACATCA 1200 

CAAAACAATT TATAAAAACT TCTTAAATAT ACGCTTTGGG CTAGTTGCAA AGACTATGCT 1260 

AATAGCACTT CCAGTGAGAG TGATATATTT AAGTGTACTG GATCTGGAAT GGTGTTTTGG 1320 

TTTGGGGGGA ATTTTTTTTT TTTCCTGGCA AATCACATAT GTTGTTGATG TGAGTATCTG 1380 

ATGAAAAAAC AATGTCAGAA TAACCGACAT GAAAATTTTT TAGGATAACT TGGTGCCTAC 1440

CTGAAAAATG TATTGTGTTT TAGACTCTTG ATTTCAAAAG GTTCCACAGA ACTAGTCTGC 1500 

GCTTACCTTA CCCATGTTTA TATATAGCTG TCCTACAGGG AGCTTTTATT TAGAAAATGT 1560 

CTGCATAATG TTAGATTCTT CTCCTGTCTA CATTATGCAC TACATAATTG GACTTCATTA 1620 

TGCTTTTGAA ATGCTTATCT GCCTGTCACA TAAGTTAAAC TATTTAATTT GTTTTGAATG 1680 

TTTTGGATTG CTACACAATA CAATATTCTA AATTTAGGCA TGAGGGTTTT TTTGTTTTAT 1740 

TTTTACTTTT TTTTTGTCAT TGCACTATGG AACACAAATG AAATTCTCTT AATTTATAAG 1800 

AAGATAGTAG GAGTTAAATT TTGAAAATGG TTGTGATGAG CCACGAAATT CAATCTTTAT 1860 

AATATAGGTA CTGCTCTTTC AGACAAACAG TCCATTTTTA ATGACTTCTT ATTTTGTTGA 1920 

AATTACTTTA ACTGCTAATC ACTGTGGTTG CCAAATATTT ACTTCAGAAG CAAAGATTTT 1980 

CAAACAAGCA TACACGATGC AAAATACCAG TCTGGCTTCT AGTCTATTTA CTGTTTTGTT 2040 

TCACTCAGAT TAGCTCAGTT TTCTCATCAA AGCAGAATGC TATCTTGCGT GTGTGTGTGT 2100 

GTGTGTGTGT GTGTGTGTGT GTATGTGTGT ATATATATAT ATATATATAT ATATATATTT 2160 

TTTTTTTTTT TTTTTTTTAA ATTACAAAAG CCATGAGCTG CTTTTATGCT GAAAATGGTC 2220 

ATTTCCCTGT TCACTTACTG ACATGTGAAG AAGGGTTTCT TGCTTTCTTA AACATTTCCG 2280 

TAAGGCAGGC TAGAAATGTA ATACTTCAAA TGTTTGATGA TTATGGTCTT TTGATAGGAA 2340 

TAGATTCTGC TTGGGATATA TATCCAGGCA CTCTCTAAGG TCTAGGGTTG ATATTAACAA 2400 

AGGAATGTAC TTAGAATAGC AGTACATTTT ATGCAAATAT GGAAATTATT TTAAGAAACA 2460 

ATGACATATC AAAACTGCTT TTTACATGAT TTTGAAATAG ACTAGAAAGC TTTCCCTATA 2520 

GACATATTAA TATTCCAATC ATAACTTTAA TTCAAGAATG CAGTTTTACC AAAAGAAAAA 2580 

TTTGAAAATT TCTATTCAGG CTACTGGAAT TGGTTATTAA AAGAAAAAGG AAAAAGAAGA 2640 

ATCTTGCTGC TTTCAGTATT TCCTGATTTT TTTGTAAATA TAAAGAGGAA CTTCAATTAT 2700 

GAAAAATTTT TAAAAGATAT ATATATCTAT ATATCTATAT ATATGTACTG TTTTGTTTCC 2760 

TGTCTTGAAG ATTTTGAGTT ATGGTTATTG GTTTCAGATT GATTAATTCA CATATGCTGT 2820 

GTTTTCTTTA AAAGTCATAT GGGTTCGTGG CCTAATGCCT TGGATTTTAC ATATTTTTCT 2880

TTTTAAATGC AAAACCTTTT CAACAAAATA GTGTTTGTCA TCAGGTTGGT ACTAAACATT 2940

TATAATTACT GTGTAATTAT AAACAAAAAT ACATAAAGCT TTGAATATAA TTATGTAGCA 3000

TAAAAGTTAA GGTTGTTCAC TATGATGGCA TCTTAGAATT AAACAAAACT TTTACTAGGG 3060

CTGAAAAGAG AAGACTGATT TAATGTGGTG TGATTATTCT GAAGATAAAT GTCTGGCTAC 3120

AGGGAATATT TTGTACTAAA AAATGATTAC ACATATGGCT GTGTGTGTTT GAGTCTGTGT 3180

CTGTGAGAGA GCCAGAGAGA GTGAGAGAGA TTGACAGAGA AAGGGAGAGA CACACACACG 3240 

CCCCTTGAAT TGCTTTAACT CCTAAGTGTT TCAGTCCTCA TTCCGGTAAA CTCCCCATGC 3300 

TGATTCTTTG TTTTAAACTG AACCATAGGT ACAGTTTCCT TTTTGCCAAA TGTCAAAACA 3360 

GGTACAAATT TTAAAATGTA ATGCTTTTTA AATAGAAAAA TGTATAAAAT TAGAAGTGCC 3420 

CACATATAAA AAATACTTGA GATGAAGATT ATCTTTAGTG AATATCATCT GCATATCTCT 3480 

GTAAGTTCAA TTGTGTTTCT TACAGTCCCT GTCATATTAC CAACAGAGGC AATAAAAGCT 3540 
GCAGTGAAAT TG 

Seq ID NO: 164 Protein sequence:
Protein Accession #: AAG00606

1 11 21 31 41 51 

I I I I I I 

MNKLKSSQKD KVRQFMIFTQ SSEKTAVSCL SQNDWKLDVA TDNFFQNPEL YIRESVKGSL 60 

DRKKLEQLYN RYKDPQDENK IGIDGIQQFC DDLALDPASI SVLIIAWKFR AATQCEFSKQ 120 

EFMDGMTELG CDSIEQLKAQ IPKMEQELKE PGRFKDFYQF TFNFAKNPGQ KGLDLEMAIA 180 

YWNLVLNGRF KFLDLWNKFL LEHHKRSIPK DTWNLLLDFS TMIADDMSNY DEEGAWPVLI 240 
DDFVEFARPQ IAGTKSTTV

Seq ID NO: 165 DNA sequence
Nucleic Acid Accession #: AF256215 
Coding sequence: 220-2028 

1 11 21 31 41 51 

I I I I I I

CTCCAGTCCG CATGCTCAGT AGCTGCTGCC GGCCGGGCTG CGGGGCGGCG TCCGCTGCGC 60 

GCCTACGGGC TGCGGTGGCG GCCGCCGCGG CACCCGGCAG GGCCCGCCAG TCCCCGCTTC 120 

CCTGCTCCAG AGCCGCCGCC TGGGCCGGGG CAGGGCGGGC CCGGGGCTCC TCCATGCTGC 180

CAGCCGCCGG GCTGCGGAGC CGACCAAGTG GCTCCTGCGA TGGCGGCGGA AGAGGAGGCT 240 

GCGGCGGGAG GTAAAGTGTT GAGAGAGGAG AACCAGTGCA TTGCTCCTGT GGTTTCCAGC 300 

CGCGTGAGTC CAGGGACAAG ACCAACAGCT ATGGGGTCTT TCAGCTCACA CATGACAGAG 360 

TTTCCACGAA AACGCAAAGG AAGTGATTCA GACCCATCCC AAGTGGAAGA TGGTGAACAC 420 

CAAGTTAAAA TGAAGGCCTT CAGAGAAGCT CATAGCCAAA CTGAAAAGCG GAGGAGAGAT 480 

AAAATGAATA ACCTGATTGA AGAACTGTCT GCAATGATCC CTCAGTGCAA CCCCATGGCG 540 

CGTAAACTGG ACAAACTTAC AGTTTTAAGA ATGGCTGTTC AACACTTGAG ATCTTTAAAA 600 

GGCTTGACAA ATTCTTATGT GGGAAGTAAT TATAGACCAT CATTTCTTCA GGATAATGAG 660 

CTCAGACATT TAATCCTTAA GACTGCAGAA GGCTTCTTAT TTGTGGTTGG ATGTGAAAGA 720 

GGAAAAATTC TCTTCGTTTC TAAGTCAGTC TCCAAAATAC TTAATTATGA TCAGGCTAGT 780 

TTGACTGGAC AAAGCTTATT TGACTTCTTA CATCCAAAAG ATGTTGCCAA AGTAAAGGAA 840 

CAACTTTCTT CTTTTGATAT TTCACCAAGA GAAAAGCTAA TAGATGCCAA AACTGGTTTG 900 

CAAGTTCACA GTAATCTCCA CGCTGGAAGG ACACGTGTGT ATTCTGGCTC AAGACGATCT 960 

TTTTTCTGTC GGATAAAGAG TTGTAAAATC TCTGTCAAAG AAGAGCATGG ATGCTTACCC 1020 

AACTCAAAGA AGAAAGAGCA CAGAAAATTC TATACTATCC ATTGCACTGG TTACTTGAGA 1080 

AGCTGGCCTC CAAATATTGT TGGAATGGAA GAAGAAAGGA ACAGTAAGAA AGACAACAGT 1140 

AATTTTACCT GCCTTGTGGC CATTGGAAGA TTACAGCCAT ATATTGTTCC ACAGAACAGT 1200 

GGAGAGATTA ATGTGAAACC AACTGAATTT ATAACCCGGT TTGCAGTGAA TGGAAAATTT 1260 

GTCTATGTAG ATCAAAGGGC AACAGCGATT TTAGGATATC TGCCTCAGGA ACTTTTGGGA 1320 

ACTTCTTGTT ATGAATATTT TCATCAAGAT GACCACAATA ATTTGACTGA CAAGCACAAA 1380 

GCAGTTCTAC AGAGTAAGGA GAAAATACTT ACAGATTCCT ACAAATTCAG AGCAAAAGAT 1440 

GGCTCTTTTG TAACTTTAAA AAGCCAATGG TTTAGTTTCA CAAATCCTTG GACAAAAGAA 1500 

CTGGAATATA TTGTATCTGT CAACACTTTA GTTTTGGGAC ATAGTGAGCC TGGAGAAGCA 1560 

TCATTTTTAC CTTGTAGCTC TCAATCATCA GAAGAATCCT CTAGACAGTC CTGTATGAGT 1620 

GTACCTGGAA TGTCTACTGG AACAGTACTT GGTGCTGGTA GTATTGGAAC AGATATTGCA 1680 

AATGAAATTC TGGATTTACA GAGGTTACAG TCTTCTTCAT ACCTTGATGA TTCGAGTCCA 1740 

ACAGGTTTAA TGAAAGATAC TCATACTGTA AACTGCAGGA GTATGTCAAA TAAGGAGTTG 1800 

TTTCCACCAA GTCCTTCTGA AATGGGGGAG CTAGAGGCTA CCAGGCAAAA CCAGAGTACT 1860 

GTTGCTGTCC ACAGCCATGA GCCACTCCTC AGTGATGGTG CACAGTTGGA TTTCGATGCC 1920 

CTATGTGACA ATGATGACAC AGCCATGGCT GCATTTATGA ATTACTTAGA AGCAGAGGGG 1980 

GGCCTGGGAG ACCCTGGGGA CTTCAGTGAC ATCCAGTGGA CCCTCTAGCC TTTGATTTTT 2040 

AACTCCAAAA ATGAGAAACA TTTTAAAGCA TTATTTACGA AAAAACTGTC TCAACTATTC 2100 

TTAAGTACTG TATTGATATT GTTTGTATCT TTTATTAATG TTCTACCACT TTTTATAGAT 2160 

TTGCATCTTC CTGTCACAGG GATGTGGGGA AATACGTTTT CCTCCCAAGA GAACCAAGTT 2220 

TATTATAGAC TCCTTTATTC AGTGAAATGG CTTATAATCC ACTAGTTGCC ATATTTTTGC 2280 

TAAAATATTT CTAACCAAGA ATACTACTTA CATATTGTTT TGGCTTTGTT TTATTTTTGA 2340 

TGCAGTTTTT TTTAGTTGAG GTAATGTAAT ATATTGATGT TTTCCTTTGT GTCTAAGATT 2400 

GATTTATAAT AGTAGGTTTG TATAATTTGG AACATTTTCC ATGCCTTGCG AATTTCCTTA 2460 

ATTGAGGATA GGGCTTACAC ACTTTAAGAA AACAGTGAGT ACTTGAACAT TTAAAGGGAC 2520 

AGTGCAATTT ATAGTCATAA TCACATTGAA TACTGTATTT GATCTTTGGA GACTTAGGCA 2580 

AGCACAGAGC TGGGATATTT ATGCTCAGTT GAGCACTTTA AGATGAATTT TAAGTGAGAT 2640 

GATTTCTTGC TTAAAACTCA GAAAGTCAAA AGAGTTTCAG CTTTCCTTAC AGAAAAGGAA 2700 

GGATCTTGGG CCCTAGATCT TGGGGATTAA CCTCTGCATA TAAGATTTAC TCTTAATAGG 2760 

CCAGACGTGG TGCTCACGCC TGTAATCCCA GTACTTTGGG AGGCTGAGAC GGGCAGATCA 2820 

CTTGAGGTCA GGAGTTCAAG ACCAGCCTGG CCAATATGGT GAAACCCCGT TTCTACTAAA 2880 

AATACAAAAA AAATTACCCA GGCACTCACT CTTGAGGTAA CTAACCAACT CCCACGATAA 2940 

TGACAGTCCA TTCATGAGCG CAAAGGCCTC ATGACCTAAT GGCACACACC TGTAATCCCA 3000 

ACTGCTTGGG AGGCTGAGGC GAGAGGATTG CTTGAACCTG GGAGGCAGAG GTTGCAGTGA 3060

GCCGAGATCG CACCACTGCA CTCCAGTCTG GGCAACAGAG TGAGACTTCA TCTCAAAAAA 3120 

AGTAAAAAAA AAGATTTAAT ATAATCACTG AAGATCTCTA TTATAGATAG ATTAGGTTTT 3180 

TGACATTGGA AACATACTTA GGGATAGATT TGTCCTAAAG GAAAAAAGTA GGCCCGGGCA 3240 

GATTAAATGT CTTGTGTAAA GTCACACATT AAATTCAGTC ACACATTAAA TTCATAGAGT 3300 

TTTAAATGTT TAATGTATAT AAACCAGTTT CTTTATACAC ATTTGGGAAA ACATTGGTCT 3360 

CACAGATTAA ATGATTAACT AACTGACCCA GGAACTAGTT GTAGCTTTCT AAGTAATTAG 3420 

GCAATTACAG TTATTGCCTG TAACCAAAGG TAATAAAACA AAATGACAAG TACATGTTTA 3480 

AAATTATGAG GCAATGAGAA ATAATTTAAA AACCAATTTT CTAGTTATAA TTTAAAATTT 3540 

GGAGAGCATT TTTAACAGTA ATTAATCCAG AGGTGGCTCA AATTGAGTAT AAGAATTAAG 3600 

ATTATTTAAA ATACTGCATG TCTACCTTCT CGGGGATCAT ACTTTATAAC ACTTTCTGCT 3660 

TCAGTAGCTC TTCATAGCTT GCCAAGTATG CTCCCATATT TTCTCTCTCG TGCCTCGCAA 3720

ATGAAAGTCA GATAGGCTGG GAACTCATGG GGCAGCCCTC AGACTTCAAT GTGGGCTTCA 3780 

AATCCAGTTT CCTGTTCTAT ATGGTGCTAC ATCTTTCCAG AAAATTTCCC TCAGAGCCCC 3840 

TCGCCAAAAC AAAGCATTAT TTTGACCCTG CATGCTATTT CTTTAGCTGT AGGTGATAGA 3900 

TTAGAACTTC TGTCAGACAT GTTAATGACA AACATACCAA CAGACAATAA CCAAAGCAAA 3960 

TGTTTCCTTC AAGTGTGAAA TGTGCAGGGG CTCGTGGGCA AGGATGTATT GGCACACTGT 4020 

CCTCTTGAAC TGATAGTGTC CCAGCAATGT TGGAGGTTGG CACCATTCCT GGTCCGACAC 4080 

TTGAGGACCT GAGAGACATC AGGTTTAGAA TGAGCCAAAG AAATCCTACA AGATGGGGAG 4140 

AATTGGTGTG CAGCAGCCTA AGTGTTATAG TTAAGTCTAA AGAAGTATGA AAGATCCCCT 4200 

GTGTTCTCTA AATTGAGCAG AGGGGCCTGC CTACCAATAT CACTTTTTAG GGGACTGAAC 4260 

CATTGCAGGT TAGACTTGGC TTCCAAAGAG TCTGCCTAAG CCAGGGGTGG CAGGGTAGGC 4320 

CATCATAGCT GGATGGCCTC AAAAGCAGAT GGGGGCAGAC TTGCCCTCGT GATGCCAGGA 4380 

TTTGAGAGGC AGAGTTTCTA GAGGGAGACC AGTGCTGCCT CTCACAGTGG CAGTTTTTTC 4440 

TCTTTGCAAG AGGAGGGGCT GTTCAATTCC ATAGACCAGT GGGCAGATAG CCAGTTGAAT 4500 

ACTCTGTGCA TGGTTTGATC CTTTATTAGT TCGCTCTAAT ATTTTTCTGT AGATCCTTTT 4560 

GTCCTGGACT CAAAATCTAA TCCATGCATT GTATGATACC GTAGCTCTCC TAAGGTTTGT 4620 

GTTTCCTTCA AAATGTTTTA GTTTTCTTCA ACTAAATTTG ATTTTTGCTG TTAGAAGTGA 4680 

CATATTTTTA TGGTATACAC TATGTTCCTT TTTTCTACTG CGAGTCAATT TTTTGAATTT 4740 

TCGTGAGAAA GAATATATCT ACAAATTGCA CGAAAGTATC ATAAAAACAG TACTCTAGAG 4800

CAGCGCTGTC CAATAGAAAT ATAATCTGAG CCACATGTAT AATTTTATTT TCTTCTAGCC 4860

ACATTAAAGA AGTAAAAAGA TACAAGTAGA ACTAATTTTA ATGTTTTAAT TCAGTATATC 4920 

CAAAATATCA TTTGAACATG TAATTAATAT AAAATTATTA ATGTGATATT TTACATTCTT 4980 

TTGGTAATAC TAGTCTTCAA AATCTGGTAT GTATCTTACA TTGATAGCAC ATCTCACTTT 5040 

GTACTAGCCA CATTGCAAGT GCTCAGTAGC CACATGTGGC TAGTGGCTAC TGCACTGGAC 5100 

AGCACAGTTC TAGGTTCCAC CCTAACACCC AAGTCCTGTG GATTAGAATC CCAGAATCAG 5160

AGCTGGAAGT AAACATAGAG ATCAAACCTC CTTTTAAAAA TGAGGACGCT GAGGCACAGA 5220 

GTTTAAATGG CTTGCATGAG GTCATACAGC TAAATTCAGC CTCAACAGGG TCTTCTGATT 5280 

CCAGGCACTC TTCCCACTCC ACTACATTAC TGTAGTGGTA ATTCTTAGGG TTAAAAAAAG 5340 

TGTAGAGTAG GCCGGGCGCA GTGGCTCATG CCTGTAATCC CAGCACTTTG GGAGGCCGAA 5400 

GTGGGCGGAT CACGAGGTCA GGAGATCGAG ACCATCCTGG CCAACATGGT GAAACCCCGT 5460 

CTCTACTGAA AATACAAAGC AAAATTAGCC AGGTGTGGTG GCGGGCGCCT GTGGTCCCAG 5520 

CTGCTCTGGA GGCTGAGGCA GAATGGCGTG AACCCAGGAG GCAGAGATGG CAGTGAGCCA 5580 

AGATCGCGCC ACTGCACCCC AGCCTGGGCG ACAGAGCGAG ACTCCATCTC AAAAAAAAAA 5640

AAAAAAAAAA AAGAAAAGAA AAGAAAAGTC TAGAGAACAT TATATTAAGT GGTTATTATT 5700 

GAAGTAGACC AAAGTTTATA CCATAAGGAT ATTTTTCCTT AAATACCATG TTTGAAGAAC 5760 

AATTATTTAT TGATCCTTGA ATCTGTAAGA TCAAATAACA AGTCTCTATC CATGTTACCA 5820 

AATTTAACCT TTTGAAAATA ATAAACTTTA AAATATCAGA TGTGTTATTA CAGGATGATA 5880 

CTTGGAATCA AGTGAAATGA GTTATATGGT CATCACTAAA TTTAGAAATC TATTGTGAAA 5940 

CAAAGACAAA CAGGAAAGTA CAGAATAGAG ACTTTTAGTA AATAAATGGA ATTTAAAAGA 6000 

AAGTGTTTAT TTACAGTGTC ACGACAGAAA AGGATGTCTT TGTTGTCATA GTCTTTGAGG 6060 

GATCTCCGTA AAATCTGGGG CACAGGTACA AGAAATAGCC AATATTTAGT TCCCAGACCA 6120 

TGTTTAGTAG TGTCCAGTTT CAGATCATGC TGCCAAGAGG TATCTCCCCC TCAGGTGGGT 6180 

CATCACTGAG CCCTGGAATT GGAGACTCAT ACTTGCCCAG CACAATGTTA CGGGCAGACA 6240 

GGCCGACATC TATGATTAGC TAGAAGCCAT AAAGAAAAGC TGCTAAGTGG CCACTAGGTG 6300 

CCACTTTTCT GTTTTTGTAA TGCTTTCATT AGCAGATCTT TTTTTTCCAA GCTCCATGGG 6360 

GCCTATGAGA GGCATTTATG ATTTTTGTGC CTACAATAAG TCAGCCTGTC TGGTGTGAGT 6420

TGTTTTATGA GAAATGCTTT CCAAGGGAGG TCTAGGAAGA TCCTGACACA TAAGAACTTT 6480 

GGCTTAGAGA GCTTTCCAGG TGTAGTGCCA ATAAAAACTG ACCTGGAAAG AAAACCTGCC 6540 

CAGCACGGAA CATGCTTTCT GAACTCACTT GAGAGTGTAT GGTGTATGTC ACTTCTCATA 6600 

TATTCTTGAG TTTAGATTTG TCTTTTATAC AATTTTTAGC TCTTTTCCAG TTCACTTGTG 6660 

CTCGTCTGTA TATTGGTATT TTTAAATTTT TGTGGTAAAT AATGAAAAGA GTGAAATTAT 6720 

ATTTTATAAT TACTCATTTG TAGTTTTTTT TTTTAATTTA ATAAACTTCC TCCAAAAAGT 6780
GCTCCCTTAA AA 

Seq ID NO: 166 Protein sequence:
Protein Accession #: AAG34652

1 11 21 31 41 51 

I I I I I I 

MAAEEEAAAG GKVLREENQC IAPWSSRVS PGTRPTAMGS FSSHMTEFPR KRKGSDSDPS 60 

QVEDGEHQVK MKAFREAHSQ TEKRRRDKMN NLIEELSAMI PQCNPMARKL DKLTVLRMAV 120 

QHLRSLKGLT NSYVGSNYRP SPLQDNELRH LILKTAEGPL FWGCERGKI LFVSKSVSKI 180 

LNYDQASLTG QSLFDFLHPK DVAKVKEQLS SFDISPREKL IDAKTGLGVH SNLHAGRTRV 240 

YSGSRRSFFC RIKSCKISVK EEHGCLPNSK KKEHRKFYTI HCTGYLRSWP PNIVGMEEER 300 

NSKKDNSNFT CLVAIGRLQP YIVPQNSGEI NVKPTEFITR FAVNGKFVYV DQRATAXLGY 360 

LPQELLGTSC YEYFHQDDHN NLTDKHKAVL QSKEKILTDS YKFRAKDGSF VTLKSQWFSF 420 

TNPWTKELEY IVSVNTLVLG HSEPGEASFL PCSSQSSEES SRGSCMSVPG MSTGTVLGAG 480 

SIGTDIANEI LDLQRLQSSS YLDDSSPTGL MKDTHTVNCR SMSNKELFPP SPSEMGELEA 540 

TRQNQSTVAV HSHEPLLSDG AQLDFDALCD NDDTAMAAFM NYLEAEGGLG DPGDFSDIQW 600 
TIj 

Seq ID NO: 167 DNA sequence 
Nucleic Acid Accession #: NM_014400 
Coding sequence: 86-1126 

1 11 21 31 41 51 

I I I I I I 

GGTTACTCAT CCTGGGCTCA GGTAAGAGGG CCCGAGCTCG GAGGCGGCAC ACCCAGGGGG 60 

GACGCCAAGG GAGCAGGACG GAGCCATGGA CCCCGCCAGG AAAGCAGGTG CCCAGGCCAT 120 

GATCTGGACT GCAGGCTGGC TGCTGCTGCT GCTGCTTCGC GGAGGAGCGC AGGCCCTGGA 180 

GTGCTACAGC TGCGTGCAGA AAGCAGATGA CGGATGCTCC CCGAACAAGA TGAAGACAGT 240 

GAAGTGCGCG CCGGGCGTGG ACGTCTGCAC CGAGGCCGTG GGGGCGGTGG AGACCATCCA 300 

CGGACAATTC TCGCTGGCAG TGCSGGGTTG CGGTTCGGGA CTCCCCGGCA AGAATGACCG 360

CGGCCTGGAT CTTCACGGGC TTCTGGCGTT CATCCAGCTG CAGCAATGCG CTCAGGATCG 420 

CTGCAACGCC AAGCTCAACC TCACCTCGCG GGCGCTCGAC CCGGCAGGTA ATGAGAGTGC 480 

ATACCCGCCC AACGGCGTGG AGTGCTACAG CTGTGTGGGC CTGAGCCGGG AGGCGTGCCA 540 

GGGTACATCG CCGCCGGTCG TGAGCTGCTA CAACGCCAGC GATCATGTCT ACAAGGGCTG 600 

CTTCGACGGC AACGTCACCT TGACGGCAGC TAATGTGACT GTGTCCTTGC CTGTCCGGGG 660 

CTGTGTCCAG GATGAATTCT GCACTCGGGA TGGAGTAACA GGCCCAGGGT TCACGCTCAG 720 

TGGCTCCTGT TGCCAGGGGT CCCGCTGTAA CTCTGACCTC CGCAACAAGA CCTACTTCTC 780 

CCCTCGAATC CCACCCCTTG TCCGGCTGCC CCCTCCAGAG CCCACGACTG TGGCCTCAAC 840

CACATCTGTC ACCACTTCTA CCTCGGCCCC AGTGAGACCC ACATCCACCA CCAAACCCAT 900 

GCCAGCGCCA ACCAGTCAGA CTCCGAGACA GGGAGTAGAA CACGAGGCCT CCCGGGATGA 960 

GGAGCCCAGG TTGACTGGAG GCGCCGCTGG CCACCAGGAC CGCAGCAATT CAGGGCAGTA 1020 

TCCTGCAAAA GGGGGGCCCC AGCAGCCCCA TAATAAAGGC TGTGTGGCTC CCACAGCTGG 1080 

ATTGGCAGCC CTTCTGTTGG CCGTGGCTGC TGGTGTCCTA CTGTGAGCTT CTCCACCTGG 1140 

AAATTTCCCT CTCACCTACT TCTCTGGCCC TGGGTACCCC TCTTCTCATC ACTTCCTGTT 1200 

CCCACCACTG GACTGGGCTG GCCCAGCCCC TGTTTTTCCA ACATTCCCCA GTATCCCCAG 1260 

CTTCTGCTGC GCTGGTTTGC GGCTTTGGGA AATAAAATAC CGTTGTATAT ATTCTGGCAG 1320 

GGGTGTTCTA GCTTTTTGAG GACAGCTCCT GTATCCTTCT CATCCTTGTC TCTCCGCTTG 1380 

TCCTCTTGTG ATGTTAGGAC AGAGTGAGAG AAGTCAGCTG TCACGGGGAA GGTGAGAGAG 1440 

AGGATGCTAA GCTTCCTACT CACTTTCTCC TAGCCAGCCT GGACTTTGGA GCGTGGGGTG 1500 

GGTGGGACAA TGGCTCCCCA CTCTAAGCAC TGCCTCCCCT ACTCCCCGCA TCTTTGGGGA 1560 

ATCGGTTCCC CATATGTCTT CCTTACTAGA CTGTGAGCTC CTCGAGGGCA GGGACCGTGC 1620 

CTTATGTCTG TGTGTGATCA GTTTCTGGCA CATAAATGCC TCAATAAAGA TTTAATTACT 1680 
TTGTATAGTG AAAAAAAA

Seq ID NO: 168 Protein sequence: 
Protein Accession #: NP_055215

1 11 21 31 41 51

I I I I I I

MDPARKAGAQ AMIWTAGWLL LLLLRGGAQA LECYSCVQKA DDGCSPNKMK TVKCAPGVDV 60 

CTEAVGAVET IHGQFSLAVX GCGSGLPGKN DRGLDLHGLL AFIQLQQCAQ DRCNAKLNLT 120

SRALDPAGNE SAYPPNGVEC YSCVGLSREA CQGTSPPVVS CYNASDHVYK GCFDGNVTLT 180

AANVTVSLPV RGCVQDEFCT RDGVTGPGFT LSGSCCQGSR CNSDLRNKTY FSPRIPPLVR 240

LPPPEPTTVA STTSVTTSTS APVRPTSTTK PMPAPTSQTP RQGVEHEASR DEEPRLTGGA 300 
AGHQDRSNSG QYPAKGGPQQ PHNKGCVAPT AGLAALLLAV AAGVLL 

Seq ID NO: 169 DNA sequence 
Nucleic Acid Accession #: NM_006875 
Coding sequence: 186-1190 

1 11 21 31 41 51 

I I I I I I 

GAATTCGGCA CGAGCGCGCG GCGAATCTCA ACGCTGCGCC GTCTGCGGGC GCTTCCGGGC 60 

CACCAGTTTC TCTGCTTTCC ACCCTGGCGC CCCCCAGCCC TGGCTCCCCA GCTGCGCTGC 120 

CCCGGGCGTC CACGCCCTGC GGGCTTAGCG GGTTCAGTGG GCTCAATCTG CGCAGCGCCA 180 

CCTCCATGTT GACCAAGCCT CTACAGGGGC CTCCCGCGCC CCCCGGGACC CCCACGCCGC 240 

CGCCAGGAGG CAAGGATCGG GAAGCGTTCG AGGCCGAGTA TCGACTCGGC CCCCTCCTGG 300 

GTAAGGGGGG CTTTGGCACC GTCTTCGCAG GACACCGCCT CACAGATCGA CTCCAGGTGG 360 

CCATCAAAGT GATTCCCCGG AATCGTGTGC TGGGCTGGTC CCCCTTGTCA GACTCAGTCA 420 

CATGCCCACT CGAAGTCGCA CTGCTATGGA AAGTGGGTGC AGGTGGTGGG CACCCTGGOG 480 

TGATCCGCCT GCTTGACTGG TTTGAGACAC AGGAAGGCTT CATGCTGGTC CTCGAGCGGC 540 

CTTTGCCCGC CCAGGATCTC TTTGACTATA TCACAGAGAA GGGCCCACTG GGTGAAGGCC 600 

CAAGCCGCTG CTTCTTTGGC CAAGTAGTGG CAGCCATCCA GCACTGCCAT TCCCGTGGAG 660 

TTGTCCATCG TGACATCAAG GATGAGAACA TCCTGATAGA CCTACGCCGT GGCTGTGCCA 720 

AACTCATTGA TTTTGGTTCT GGTGCCCTGC TTCATGATGA ACCCTACACT GACTTTGATG 780 

GGACAAGGGT GTACAGCCCC CCAGAGTGGA TCTCTCGACA CCAGTACCAT GCACTCCCGG 840 

CCACTGTCTG GTCACTGGGC ATCCTCCTCT ATGACATGGT GTGTGGGGAC ATTCCCTTTG 900 

AGAGGGACCA GGAGATTCTG GAAGCTGAGC TCCACTTCCC AGCCCATGTC TCCCCAGACT 960 

GCTGTGCCCT AATCCGCCGG TGCCTGGCCC CCAAACCTTC TTCCCGACCC TCACTGGAAG 1020 

AGATCCTGCT GGACCCCTGG ATGCAAACAC CAGCCGAGGA TGTTACCCCT CAACCCCTCC 1080 

AAAGGAGGCC CTGCCCCTTT GGCCTGGTCC TTGCTACCCT AAGCCTGGCC TGGCCTGGCC 1140 

TGGCCCCCAA TGGTCAGAAG AGCCATCCCA TGGCCATGTC ACAGGGATAG ATGGACATTT 1200 

GTTGACTTGG TTTTACAGGT CATTACCAGT CATTAAAGTC CAGTATTACT AAGGTAAGGG 1260 

ATTGAGGATC AGGGGTTAGA AGACATAAAC CAAGTTTGCC CAGTTCCCTT CCCAATCCTA 1320 

CAAAGGAGCC TTCCTCCCAG AACCTGTGGT CCCTGATTTT GGAGGGGGAA CTTCTTGCTT 1380 

CTCATTTTGC TAAGGAAGTT TATTTTGGTG AAGTTGTTCC CATTTTGAGC CCCGGGACTC 1440 

TTATTTTGAT GATGTGTCAC CCCACATTGG CACCTCCTAC TACCACCACA CAAACTTAGT 1500 

TCATATGCTT TTACTTGGGC AAGGGTGCTT TCCTTCCAAT ACCCCAGTAG CTTTTATTTT 1560 

AGTAAAGGGA CCCTTTCCCC TAGCCTAGGG TCCCATATTG GGTCAAGCTG CTTACCTGCC 1620 

TCAGCCCAGG ATTTTTTATT TTGGGGGAGG TAATGCCCTG TTGTTACCCC AAGGCTTCTT 1680 

TTTTTTTTTT TTTTTTTTTG GGTGAGGGGA CCCTACTTTG TTATCCCAAG TGCTCTTATT 1740 

CTGGTGAGAA GAACCTTAAT TCCATAATTT GGGAAGGAAT GGAAGATGGA CACCACCGGA 1800 

CACCACCAGA CAATAGGATG GGATGGATGG TTTTTTGGGG GATGGGCTAG GGGAAATAAG 1860 

GCTTGCTGTT TGTTTTCCTG GGGCGCTCCC TCCAATTTTG CAGATTTTTG CAACCTCCTC 1920 

CTGAGCCGGG ATTGTCCAAT TACTAAAATG TAAATAATCA CGTATTGTGG GGAGGGGAGT 1980 

TCCAAGTGTG CCCTCCTTTT TTTTCCTGCC TGGATTATTT AAAAAGCCAT GTGTGGAAAC 2040 
CCACTATTTA ATAAAAGTAA TAGAATCAGA AAAAAAAAAA AAAAAAAA 



Seq ID NO: 170 Protein sequence: 
Protein Accession #: NP_006866

1 11 21 31 41 51 

I I I I I I 

MLTKPLQGPP APPGTPTPPP GGKDREAFEA EYRLGPLLGK GGFGTVFAGH RLTDRLQVAI 60 
KVIPRNRVLG WSPLSDSVTC PLEVALLWKV GAGGGHPGVI RLLDWFETQE GFMLVLERPL 120 
PAQDLFDYIT EKGPLGEGPS RCFFGQVVAA IQHCHSRGVV HRDIKDENIL IDLRRGCAKL 180
IDFGSGALLH DEPYTDFDGT RVYSPPEWIS RHQYHALPAT VWSLGILLYD MVCGDIPFER 240
DQEILEAELH FPAHVSPDCC ALIRRCLAPK PSSRPSLEEI LLDPWMQTPA EDVTPQPLQR 300
RPCPFGLVLA TLSLAWPGLA PNGQKSHPMA MSQG

Seq ID NO: 171 DNA sequence 
Nucleic Acid Accession #: NM_003646
Coding sequence: 89-2875

1 11 21 31 41 51

I I I I I I

GCGGCGCGGA GCGGGCGTGC TGAGCCCCGG CCGCCGGCCC GGCATGGGCG TCTCCCGCGG 60 

GCCCTCCGCC GGCCGGGGCT AGGGCCGGAT GGAGCCGCGG GACGGTAGCC CCGAGGCCCG 120

GAGCAGCGAC TCCGAGTCGG CTTCCGCCTC GTCCAGCGGC TCCGAGCGCG ACGCCGGTCC 180 

CGAGCCGGAC AAGGCGCCGC GGCGACTCAA CAAGCGGCGC TTCCCGGGGC TGCGGCTCTT 240 

CGGGCACAGG AAAGCCATCA CCAAGTCGGG CCTCCAGCAC CTGGCCCCCC CTCCGCCCAC 300 

CCCTGGGGCC CCGTGCAGCG AGTCAGAGCG GCAGATCCGG AGTACAGTGG ACTGGAGCGA 360 

GTCAGCGACA TATGGGGAGC ACATCTGGTT CGAGACCAAC GTGTCCGGGG ACTTCTGCTA 420 

CGTTGGGGAG CAGTACTGTG TAGCCAGGAT GCTGAAGTCA GTGTCTCGAA GAAAGTGCGC 480 

AGCCTGCAAG ATTGTGGTGC ACACGCCCTG CATCGAGCAG CTGGAGAAGA TAAATTTCCG 540 

CTGTAAGCCG TCCTTCCGTG AATCAGGCTC CAGGAATGTC CGCGAGCCAA CCTTTGTACG 600 

GCACCACTGG GTACACAGAC GACGCCAGGA CGGCAAGTGT CGGCACTGTG GGAAGGGATT 660 

CCAGCAGAAG TTCACCTTCC ACAGCAAGGA GATTGTGGCC ATCAGCTGCT CGTGGTGCAA 720 
GCAGGCATAC CACAGCAAGG TGTCCTGCTT CATGCTGCAG CAGATCGAGG AGCCGTGCTC 780

GCTGGGGGTC CACGCAGCCG TGGTCATCCC GCCCACCTGG ATCCTCCGCG CCOGOAGGCC 840 

CCAGAATACT CTGAAAGCAA GCAAGAAGAA GAAGAGGGCA TCCTTCAAGA GGAAGTCCAG 900 

CAAGAAAGGG CCTGAGGAGG GCCGCTGGAG ACCCTTCATC ATCAGGCCCA CCCCCTCCCC 960

GCTCATGAAG CCCCTGCTGG TGTTTGTGAA CCCCAAGAGT GGGGGCAACC AGGGTGCAAA 1020 

GATCATCCAG TCTTTCCTCT GGTATCTCAA TCCCOGACAA GTCTTCGACC TGAGCCAGGG 1080 

AGGGCCCAAG GAGGCGCTGG AGATGTACCG CAAAGTGCAC AACCTGCGGA TCCTGGCGTG 1140 

CGGGGGCGAC GGCACGGTGG GCTGGATCCT CTCCACCCTG GACCAGCTAC GCCTGAAGCC 1200 

GCCACCCCCT GTTGCCATCC TGCCCCTGGG TACTGGCAAC GACTTGGCCC GAACCCTCAA 1260 

CTGGGGTGGG GGCTACACAG ATGAGCCTGT GTCCAAGATC CTCTCCCACG TGGAGGAGGG 1320 

GAACGTGGTA CAGCTGGACC GCTGGGACCT CCACGCTGAG CCCAACCCCG AGGCAGGGCC 1380 

TGAGGACCGA GATGAAGGCG CCACCGACCG GTTGCCCCTG GATGTCTTCA ACAACTACTT 1440 

CAGCCTGGGC TTTGACGCCC ACGTCACCCT GGAGTTCCAC GAGTCTCGAG AGGCCAACCC 1500 

AGAGAAATTC AACAGCCGCT TTCGGAATAA GATGTTCTAC GCOGGGACAG CTTTCTCTGA 1560 

CTTCCTGATG GGCAGCTCCA AGGACCTGGC CAAGCACATC CGAGTGGTGT GTGATGGAAT 1620 

GGACTTGACT CCCAAGATCC AGGACCTGAA ACCCCAGTGT GTTGTTTTCC TGAACATCCC 1680 

CAGGTACTGT GCGGGCACCA TGCCCTGGGG CCACCCTGGG GAGCACCACG ACTTTGAGCC 1740 

CCAGCGGCAT GACGACGGCT ACCTCGAGGT CATTGGCTTC ACCATGACGT CGTTGGCCGC 1800 

GCTGCAGGTG GGCGGACACG GCGAGCGGCT GACGCAGTGT CGCGAGGTGG TGCTCACCAC 1860 

ATCCAAGGCC ATCCCGGTGC AGGTGGATGG CGAGCCCTGC AAGCTTGCAG CCTCACGCAT 1920 

CCGCATCGCC CTGCGCAACC AGGCCACCAT GGTGCAGAAG GCCAAGCGGC GGAGCGCCGC 1980 

CCCCCTGCAC AGCGACCAGC AGCCGGTGCC AGAGCAGTTG CGCATCCAGG TGAGTCGCGT 2040 

CAGCATGCAC GACTATGAGG CCCTGCACTA CGACAAGGAG CAGCTCAAGG AGGCCTCTGT 2100 

GCCGCTGGGC ACTGTGGTGG TCCCAGGAGA CAGTGACCTA GAGCTCTGCC GTGCCCACAT 2160 

TGAGAGACTC CAGCAGGAGC CCGATGGTGC TGGAGCCAAG TCCCCGACAT GCCAGAAACT 2220 

GTCCCCCAAG TGGTGCTTCC TGGACGCCAC CACTGCCAGC CGCTTCTACA GGATCGACCG 2280 

AGCCCAGGAG CACCTCAACT ATGTGACTGA GATCGCACAG GATGAGATTT ATATCCTGGA 2340 

CCCTGAGCTG CTGGGGGCAT GGGCCOGGCC TGACCTCCCA ACCCCCACTT CCCCTCTCCC 2400 

CACCTCACCC TGCTCACCCA CGCCCCGGTC ACTGCAAGGG GATGCTGCAC CCCCTCAAGG 2460 

TGAAGAGCTG ATTGAGGCTG CCAAGAGGAA CGACTTCTGT AAGCTCCAGG AGCTGCACCG 2520 

AGCTGGGGGC GACCTCATGC ACCGAGACGA GCAGAGTCGC ACGCTCCTGC ACCACGCAGT 2580 

CAGCACTGGC AGCAAGGATG TGGTCCGCTA CCTGCTGGAC CACGCCCCCC CAGAGATCCT 2640 

TGATGCGGTG GAGGAAAACG GGGAGACCTG TTTGCACCAA GCAGCGGCCC TGGGCCAGCG 2700 

CACCATCTGC CACTACATCG TGGAGGCCGG GGCCTCGCTC ATGAAGACAG ACCAGCAGGG 2760 

CGACACTCCC CGGCAGCGGG CTGAGAAGGC TCAGGACACC GAGCTGGCCG CCTACCTGGA 2820 

GAACCGGCAG CACTACCAGA TGATCCAGCG GGAGGACCAG GAGACGGCTG TGTAGCGGGC 2880 



Seq ID NO: 172 Protein sequence: 
Protein Accession #: NP_003637 

1 11 21 31 41 51

I I I I I I

MEPRDGSPEA RSSDSESASA SSSGSERDAG PEPDKAPRRL NKRRFPGLRL FGHRKAITKS 60 

GLQHLAPPPP TPGAPCSESE RQIRSTVDWS ESATYGEHIW FETNVSGDFC YVGEQYCVAR 120 

MLKSVSRRKC AACKIVVHTP CIEQLEKINF RCKPSFRESG SRNVREPTFV RHHWVHRRRQ 180

DGKCRHCGKG FQQKFTFHSK EIVAISCSWC KQAYHSKVSC FMLQQIEEPC SLGVHAAVVI 240

PPTWILRARR PQNTLKASKK KKRASFKRKS SKKGPEEGRW RPFIIRPTPS PLMKPLLVFV 300

NPKSGGNQGA KIIQSFLWYL NPRQVFDLSQ GGPKEALEMY RKVHNLRILA CGGDGTVGWI 360

LSTLDQLRLK PPPPVAILPL GTGNDLARTL NWGGGYTDEP VSKILSHVEE GNVVQLDRWD 420

LHAEPNPEAG PEDRDEGATD RLPLDVFNNY FSLGFDAHVT LEFHESREAN PEKFNSRFRN 480

KMFYAGTAFS DFLMGSSKDL AKHIRVVCDG MDLTPKIQDL KPQCVVFLNI PRYCAGTMPW 540

GHPGEHHDFE PQRHDDGYLE VIGFTMTSLA ALQVGGHGER LTQCREVVLT TSKAIPVQVD 600

GEPCKLAASR IRIALRNQAT MVQKAKRRSA APLHSDQQPV PEQLRIQVSR VSMHDYEALH 660

YDKEQLKEAS VPLGTVVVPG DSDLELCRAH IERLQQEPDG AGAKSPTCQK LSPKWCFLDA 720

TTASRFYRID RAQEHLNYVT EIAQDEIYIL DPELLGASAR PDLPTPTSPL PTSPCSPTPR 780

SLQGDAAPPQ GEELIEAAKR NDFCKLQELH RAGGDLMHRD EQSRTLLHHA VSTGSKDVVR 840

YLLDHAPPEI LDAVEENGET CLHQAAALGQ RTICHYIVEA GASLMKTDQQ GDTPRQRAEK 900
AQDTELAAYL ENRQHYQMIQ REDQETAV



Seq ID NO: 173 DNA sequence 
Nucleic Acid Accession #: AF232772 
Coding sequence: 1-1662 



1 11 21 31 41 51 

I I I I I I 

ATCCCGGTGC AGCTGACGAC AGCCCTGCGT GTGGTGGGCA CCAGCCTGTT TGCCCTGGCA 60 

GTGCTGGGTG GCATCCTGGC AGCCTATGTG ACGGGCTACC AGTTCATCCA CACGGAAAAG 120 

CACTACCTGT CCTTCGGCCT GTACGGCGCC ATCCTGGGCC TGCACCTGCT CATTCAGAGC 180 

CTTTTTGCCT TCCTGGAGCA CCGGCGCATG CGACGTGCCG GCCAGGCCCT GAAGCTGCCC 240 

TCCCCGCGGC GGGGCTCGGT GGCACTGTGC ATTGCCGCAT ACCAGGAGGA CCCTGACTAC 300 

TTGCGCAAGT GCCTGCGCTC GGCCCAGCGC ATCTCCTTCC CTGACCTCAA GGTGGTCATG 360 

GTGGTGGATG GCAACCGCCA GGAGGACGCC TACATGCTGG ACATCTTCCA CGAGGTGCTG 420

GGCGGCACCG AGCAGGCCGG CTTCTTTGTG TGGCGCAGCA ACTTCCATGA GGCAGGCGAG 480 

GGTGAGACGG AGGCCAGCCT GCAGGAGGGC ATGGACCGTG TGOGGGATGT GGTGCGGGCC 540 

AGCACCTTCT CGTGCATCAT GCAGAAGTGG GGAGGCAAGC GCGAGGTCAT GTACACGGCC 600 

TTCAAGGCCC TCGGCGATTC GGTGGACTAC ATCCAGGTGT GCGACTCTGA CACTGTGCTG 660 

GATCCAGCCT GCACCATCGA GATGCTTCGA GTCCTGGAGG AGGATCCCCA AGTAGGGGGA 720 

GTCGGGGGAG ATGTCCAGAT CCTCAACAAG TACGACTCAT GGATTTCCTT CCTGAGCAGC 780 

GTGCGGTACT GGATGGCCTT CAACGTGGAG CGGGCCTGCC AGTCCTACTT TGGCTGTGTG 840 

CAGTGTATTA GTGGGCCCTT GGGCATGTAC CGCAACAGCC TCCTCCAGCA GTTCCTGGAG 900 

GACTGGTACC ATCAGAAGTT CCTAGGCAGC AAGTGCAGCT TCGGGGATGA CCGGCACCTC 960 

ACCAACCGAG TCCTGAGCCT TGGCTACCGA ACTAAGTATA CCGCGCGCTC CAAGTGCCTC 1020 

ACAGAGACCC CCACTAAGTA CCTCCGGTGG CTCAACCAGC AAACCCGCTG GAGCAAGTCT 1080 

TACTTCCGGG AGTGGCTCTA CAACTCTCTG TGGTTCCATA AGCACCACCT CTGGATGACC 1140 

TACGAGTCAG TGGTCACGGG TTTCTTCCCC TTCTTCCTCA TTGCCACGGT TATACAGCTT 1200 

TTCTACCGGG GCCGCATCTG GAACATTCTC CTCTTCCTGC TGACGGTGCA GCTGGTGGGC 1260 

ATTATCAAGG CCACCTACGC CTGCTTCCTT CGGGGCAATG CAGAGATGAT CTTCATGTCC 1320 
CTCTACTCCC TCCTCTATAT GTCCAGCCTT CTGCCGGCCA AGATCTTTGC CATTGCTACC 1380

ATCAACAAAT CTGGCTGGGG CACCTCTGGC CGAAAAACCA TTGTGGTGAA CTTCATTGGC 1440

CTCATTCCTG TGTCCATCTG GGTGGCAGTT CTCCTGGAGG GGCTGGCCTA CACAGCTTAT 1500

TGCCAGGACC TGTTCAGTGA GACAGAGCTA GCCTTCCTTG TCTCTGGGGC TATACTGTAT 1560

GGCTGCTACT GGGTGGCCCT CCTCATGCTA TATCTGGCCA TCATCGCCCG GCGATGTGGG 1620 

AAGAAGCCGG AGCAGTACAG CTTGGCTTTT GCTGAGGTGT GACATGGCCC CCAAGCAGAG 1680 

CGGGTAAAGT GCAATGGGTA AGGGAGGGAA GGGGAATGGA AGAGAAAAGA CAGGGTGGGA 1740 

GGGAGGAGGG AGTGCTGTGT TTTAGTCTCT TAATGGTCCA AAGGACAAAT CTAAAATGCA 1800 

AAGAACGGTG ATGTAGTATG GCCTGACAGC TCTGTTTAGA GGAGGCAACA CTGATCCCCC 1860 

AGATGCAGGG CTGCAGGGGA TTCTGTGTTT TCAGACTGCC TGTCTGCTTG CATCTGCACA 1920 

TAGGCAGTAG CCTCCTCCTG GGCTCCAGAG GGCACTCAGA AGTTGTGCTA AACCAAGTTA 1980

AGTCCCATTC AGTGGCAACT TGTGATAGGT ACCTGAGTGA CGGCAACCTG CGGAAGGAGG 2040 

TTCTCCCAGC CCATCTGAAC ACAACCAGAG GTGGCAGGAG AATTTCTACT GAGCGAGGTG 2100 

GGCCGGTTAG TGTATGTCAC CCCCACCCCA CCCATAAGTA GTCATCAATG CAATAAGATT 2160 

GCGCGTGAGA TACAAGGCCC AGAAGCCTGA TCTTTGGGCA TCAGAAAACA GGGTCCAGGA 2220 

ATGGTGCTTT ATGTGAGATA CCCCACTCCA CATCAACATT CCAGGGATGA GCCAAACCAG 2280 

CAGGGAGTTA GCACTGAACT GCTTTTAAAA GTGCACATTA AAAAGGAAAG TTTGCCAGGA 2340 

GGAACAAAGA GATTGTGGTG GTGCTAAAGG AGGCCATAAG CTACACAGAG GCCTTGGGTG 2400 

TTCCACCTGG AAACTGCTCA GACGTCTAGA TGGGTTCTTA GCTTGTCTGT GATCTCTGCT 2460 

GGGGAGATAA AAAGATTAAG CCCCAACATG TTCAGAAAAG AAGTGAAGTC TTGGGTATTT 2520 

TAACCTGTAT ACTCTTGAAT TCCTCTCAAA TTCAGCTCTG ATCTGAGGCT AAGACACACT 2580 

CCCCACTTCA CTTTCTTCAA AGCCACATTT TTTGAGGTAT CACTGCAGTC ACCTCTTCTA 2640 

CCCTCATCAT CATAGGTAAG GTTTTCAAGG TGGCAATTGG GGCGGAGCCC CGGCTTCTTA 2700 

TAGAAGCTTC AGCAGGAGGC AAGCGTGTTC TCAGCACATA TGGGAACTAT GAGGAGCCTC 2760 

TGATCAAATT GGCTACAATC TTGGAGCTGC TTGGACGGAT TCCTTGGCAG CCGGGTTAGC 2820 

ATGTGTGACT TTCAGGCTAC TGTTCTTGAC AATCATCTCC AATGGAAAGC TTTTCAGTGT 2880 

TCCCAAAGTG AACTCTCAAA TCCAAAATGG TTATCTTTGA GACCATCCAT TCTCCTCAGT 2940 

GGCTTCTCCA GGGAATTCTT ACAGCCAAGT TGTGACAGTC ACTGCATTTG CCTGCTTCTT 3000 

TCCAGAAACC AAACTAGGAG ATGAAACTGG TTCCTACATC CTAAGGTTCT TGCTTTCTCT 3060 

CTCATGCCTC CTGAGGCTGT TTTTGGCTGT TTTCCCTCTG CTGCTTTTGG GGAATGAGGG 3120 

GAAGCCATTT TCCAAGTGAC TTGCAATCCA GGCTGTTCTC AGCGTTTTGA GTTTAAAACC 3180

TGGGATCCTG ACTAAGCCTT TGACTTAAGG GTTGCTTGCT TGCCCTCCAA ATGTCCTTTC 3240 

TCAAAGGGGC CAACTAACCC GTGCAGAACC AGCACTAAGG TGGACAGCAG ACAAGAGGGC 3300 

AAGCCTCTAA TGTACCAAGT GCTTCCTACA AAGACGCAAG GTGTGCTCCG AACCACAGAT 3360 

GGGCAAACCC TGGTGCTTTC CTTCATCTCC CACGAACTCA AGGGTTTTCC AAGTGTAGCT 3420 

AACAGTTGCC ACATCACACA GACCTCCAGT TTCTGGTAAG ACTGCTGGTT GACATCAGAC 3480 

CCAACCCATT GAAGGCTGGA AGGCAGCAGG CATTTGCTAA GGCAGCTGAT CCAGGCAATC 3540 

GTTCTGCTGG CCAAGAAGTT AAACTATTTT GAGCATTAGA ATGGAGGAAA TCCGGTCAGC 3600 

CAAGTGCAGA GTTCAGACTT CGCTAAGGGC TTGTTTTTCT TCAGCATTTA CTTGAAGATT 3660 

AATGTAGGAT GACAGGCTCT CCTGGCTGTC CTACCATCAG CTCTGCCTTG CACTGTGGTC 3720 

GTCAACTTTC CTCAAATCAA AAACAGGCAG GTACAGGTAG TGGGCTCACA ACGTTTGACC 3780

TCGACTGGTT TTTCTAAGTT ATTTTGTACA TTTTTCAGCA GCAAAACCAA ACTGGGTCTT 3840 

CAGCTTTATC CCCGTTTCTT GCAAGGGAAG AGCCTTTATA CAATTGGACG CATTTTGGTT 3900 

TTTCCTCATT GAGAATTCAA ATCCTCTTTT GTATTGTTTC TACAATAATT TGTAAACATA 3960 

TTTATTTTTA CCTGCTTTTT TTTTTTTTTT TAATTTTCAG GTCAAGTTTT TTATACTGCA 4020 
CTTATTTGTC AAAATAAAGA TTCTCACAT 

Seq ID NO: 174 Protein sequence: 
Protein Accession #: AAF36984 

1 11 21 31 41 51 

I I I I I I 

MPVQLTTALR VVGTSLFALA VLGGILAAYV TGYQFIHTEK HYLSFGLYGA ILGLHLLIQS 60

LFAFLEHRRM RRAGQALKLP SPRRGSVALC IAAYQEDPDY LRKCLRSAQR ISFPDLKVVM 120

VVDGNRQEDA YMLDIFHEVL GGTEQAGFFV WRSNFHEAGE GETEASLQEG MDRVRDVVRA 180

STFSCIMQKW GGKREVMYTA FKALGDSVDY IQVCDSDTVL DPACTIEMLR VLEEDPQVGG 240

VGGDVQILNK YDSWISFLSS VRYWMAFNVE RACQSYFGCV QCISGPLGMY RNSLLQQFLE 300

DWYHQKFLGS KCSFGDDRHL TNRVLSLGYR TKYTARSKCL TETPTKYLRW LNQQTRWSKS 360

YFREWLYNSL WFHKHHLWMT YESVVTGFFP FFLIATVIQL FYRGRIWNIL LFLLTVQLVG 420

IIKATYACFL RGNAEMIFMS LYSLLYMSSL LPAKIFAIAT INKSGWGTSG RKTIVVNFIG 480

LIPVSIWVAV LLEGLAYTAY CQDLFSETEL AFLVSGAILY GCYWVALLML YLAIIARRCG 540 
KKPEQYSLAF AEV 

Seq ID NO: 175 DNA sequence 
Nucleic Acid Accession #: NM_000691
Coding sequence: 43-1404

1 11 21 31 41 51

I I I I I I

CCAGGAGCCC CAGTTACCGG GAGAGGCTGT GTCAAAGGCG CCATGAGCAA GATCAGCGAG 60 

GCCGTGAAGC GCGCCCGCGC CGCCTTCAGC TCGGGCAGGA CCCGTCCGCT GCAGTTCCGA 120 

TTCCAGCAGC TGGAGGCGCT GCAGCGCCTG ATCCAGGAGC AGGAGCAGGA GCTGGTGGGC 180 

GCGCTGGCCG CAGACCTGCA CAAGAATGAA TGGAACGCCT ACTATGAGGA GGTGGTGTAC 240 

GTCCTAGAGG AGATCGAGTA CATGATCCAG AAGCTCCCTG AGTGGGCCGC GGATGAGCCC 300 

GTGGAGAAGA CGCCCCAGAC TCAGCAGGAC GAGCTCTACA TCCACTCGGA GCCACTGGGC 360 

GTGGTCCTCG TCATTGGCAC CTGGAACTAC CCCTTCAACC TCACCATCCA GCCCATGGTG 420 

GGCGCCATCG CTGCAGGGAA CGCAGTGGTC CTCAAGCCCT CGGAGCTGAG TGAGAACATG 480 

GCGAGCCTGC TGGCTACCAT CATCCCCCAG TACCTGGACA AGGATCTGTA CCCAGTAATC 540 

AATGGGGGTG TCCCTGAGAC CACGGAGCTG CTCAAGGAGA GGTTCGACCA TATCCTGTAC 600 

ACGGGCAGCA CGGGGGTGGG GAAGATCATC ATGACGGCTG CTGCCAAGCA CCTGACCCCT 660 

GTCACGCTGG AGCTGGGAGG GAAGAGTCCC TGCTACGTGG ACAAGAACTG TGACCTGGAC 720 

GTGGCCTGCC GACGCATCGC CTGGGGGAAA TTCATGAACA GTGGCCAGAC CTGCGTGGCC 780 

CCAGACTACA TCCTCTGTGA CCCCTCGATC CAGAACCAAA TTGTGGAGAA GCTCAAGAAG 840 

TCACTGAAAG AGTTCTACGG GGAAGATGCT AAGAAATCCC GGGACTATGG AAGAATCATT 900 

AGTGCCCGGC ACTTCCAGAG GGTGATGGGC CTGATTGAGG GCCAGAAGGT GGCTTATGGG 960 

GGCACCGGGG ATGCCGCCAC TCGCTACATA GCCCCCACCA TCCTCACGGA CGTGGACCCC 1020 
CAGTCCCCGG TGATGCAAGA GGAGATCTTC GGGCCTGTGC TGCCCATCGT GTGCGTGCGC 1080

AGCCTGGAGG AGGCCATCCA GTTCATCAAC CAGCGTGAGA AGCCCCTGGC CCTCTACATG 1140 

TTCTCCAGCA ACGACAAGGT GATTAAGAAG ATGATTGCAG AGACATCCAG TGGTGGGGTG 1200 

GCGGCCAACG ATGTCATCGT CCACATCACC TTGCACTCTC TGCCCTTCGG GGGCGTGGGG 1260

AACAGCGGCA TGGGATCCTA CCATGGCAAG AAGAGCTTCG AGACTTTCTC TCACCGCCGC 1320

TCTTGCCTGG TGAGGCCTCT GATGAATGAT GAAGGCCTGA AGGTCAGATA CCCCCCGAGC 1380 

CCGGCCAAGA TGACCCAGCA CTGAGGAGGG GTTGCTCCGC CTGGCCTGGC CATACTGTGT 1440 

CCCATCGGAG TGCGGACCAC CCTCACTGGC TCTCCTGGCC CTGGAGAATC GCTCCTGCAG 1500 

CCCCAGCCCA GCCCCACTCC TCTGCTGACC TGCTGACCTG TGCACACCCC ACTCCCACAT 1560 

GGGCCCAGGC CTCACCATTC CAAGTCTCCA CCCCTTTCTA GACCAATAAA GAGACAAATA 1620 
CAATTTTCTA ACTCGG 

Seq ID NO: 176 Protein sequence:
Protein Accession #: NP_000682

1 11 21 31 41 51 

I I I I I I 

MSKISEAVKR ARAAFSSGRT RPLQFRFQQL EALQRLIQEQ EQELVGALAA DLHKNEWNAY 60
YEEVVYVLEE IEYMIQKLPE WAADEPVEKT PQTQQDELYI HSEPLGVVLV IGTWNYPFNL 120
TIQPMVGAIA AGNAVVLKPS ELSENMASLL ATIIPQYLDK DLYPVINGGV PETTELLKER 180
FDHILYTGST GVGKIIMTAA AKHLTPVTLE LGGKSPCYVD KNCDLDVACR RIAWGKFMNS 240
GQTCVAPDYI LCDPSIQNQI VEKLKKSLKE FYGEDAKKSR DYGRIISARH FQRVMGLIEG 300
QKVAYGGTGD AATRYIAPTI LTDVDPQSPV MQEEIFGPVL PIVCVRSLEE AIQFINQREK 360
PLALYMFSSN DKVIKKMIAE TSSGGVAAND VIVHITLHSL PFGGVGNSGM GSYHGKKSFE 420 
TFSHRRSCLV RPLMNDEGLK VRYPPSPAKM TQH 

Seq ID NO: 177 DNA sequence

Nucleic Acid Accession #: NM_001067.1

Coding sequence: 108-4703 

1 11 21 31 41 51 

CTAACCGACG CGCGTCTGTG GAGAAGCGGC TTGGTCGGGG GTGGTCTCGT GGGGTCCTGC 60 
CTGTTTAGTC GCTTTCAGGG TTCTTGAGCC CCTTCACGAC CGTCACCATG GAAGTGTCAC 120 
CATTGCAGCC TGTAAATGAA AATATGCAAG TCAACAAAAT AAAGAAAAAT GAAGATGCTA 180 
AGAAAAGACT GTCTGTTGAA AGAATCTATC AAAAGAAAAC ACAATTGGAA CATATTTTGC 240 
TCCGCCCAGA CACCTACATT GGTTCTGTGG AATTAGTGAC CCAGCAAATG TGGGTTTACG 300 
ATGAAGATGT TGGCATTAAC TATAGGGAAG TCACTTTTGT TCCTGGTTTG TACAAAATCT 360 
TTGATGAGAT TCTAGTTAAT GCTGCGGACA ACAAACAAAG GGACCCAAAA ATGTCTTGTA 420 
TTAGAGTCAC AATTGATCCG GAAAACAATT TAATTAGTAT ATGGAATAAT GGAAAAGGTA 480 
TTCCTGTTGT TGAACACAAA GTTGAAAAGA TGTATGTCCC AGCTCTCATA TTTGGACAGC 540 
TCCTAACTTC TAGTAACTAT GATGATGATG AAAAGAAAGT GACAGGTGGT CGAAATGGCT 600 
ATGGAGCCAA ATTGTGTAAC ATATTCAGTA CCAAATTTAC TGTGGAAACA GCCAGTAGAG 660 
AATACAAGAA AATGTTCAAA CAGACATGGA TGGATAATAT GGGAAGAGCT GGTGAGATGG 720 
AACTCAAGCC CTTCAATGGA GAAGATTATA CATGTATCAC CTTTCAGCCT GATTTGTCTA 780 
AGTTTAAAAT GCAAAGCCTG GACAAAGATA TTGTTGCACT AATGGTCAGA AGAGCATATG 840 
ATATTGCTGG ATCCACCAAA GATGTCAAAG TCTTTCTTAA TGGAAATAAA CTGCCAGTAA 900 
AAGGATTTCG TAGTTATGTG GACATGTATT TGAAGGACAA GTTGGATGAA ACTGGTAACT 960 

CCTTGAAAGT AATACATGAA CAAGTAAACC ACAGGTGGGA AGTGTGTTTA ACTATGAGTG 1020 

AAAAAGGCTT TCAGCAAATT AGCTTTGTCA ACAGCATTGC TACATCCAAG GGTGGCAGAC 1080 

ATGTTGATTA TGTAGCTGAT CAGATTGTGA CTAAACTTGT TGATGTTGTG AAGAAGAAGA 1140 

ACAAGGGTGG TGTTGCAGTA AAAGCACATC AGGTGAAAAA TCACATGTGG ATTTTTGTAA 1200 

ATGCCTTAAT TGAAAACCCA ACCTTTGACT CTCAGACAAA AGAAAACATG ACTTTACAAC 1260 

CCAAGAGCTT TGGATCAACA TGCCAATTGA GTGAAAAATT TATCAAAGCT GCCATTGGCT 1320 

GTGGTATTGT AGAAAGCATA CTAAACTGGG TGAAGTTTAA GGCCCAAGTC CAGTTAAACA 1380 

AGAAGTGTTC AGCTGTAAAA CATAATAGAA TCAAGGGAAT TCCCAAACTC GATGATGCCA 1440 

ATGATGCAGG GGGCCGAAAC TCCACTGAGT GTACGCTTAT CCTGACTGAG GGAGATTCAG 1500 

CCAAAACTTT GGCTGTTTCA GGCCTTGGTG TGGTTGGGAG AGACAAATAT GGGGTTTTCC 1560 

CTCTTAGAGG AAAAATACTC AATGTTCGAG AAGCTTCTCA TAAGCAGATC ATGGAAAATG 1620 

CTGAGATTAA CAATATCATC AAGATTGTGG GTCTTCAGTA CAAGAAAAAC TATGAAGATG 1680 

AAGATTCATT GAAGACGCTT CGTTATGGGA AGATAATGAT TATGACAGAT CAGGACCAAG 1740 

ATGGTTCCCA CATCAAAGGC TTGCTGATTA ATTTTATCCA TCACAACTGG CCCTCTCTTC 1800 

TGCGACATCG TTTTCTGGAG GAATTTATCA CTCCCATTGT AAAGGTATCT AAAAACAAGC 1860 

AAGAAATGGC ATTTTACAGC CTTCCTGAAT TTGAAGAGTG GAAGAGTTCT ACTCCAAATC 1920 

ATAAAAAATG GAAAGTCAAA TATTACAAAG GTTTGGGCAC CAGCACATCA AAGGAAGCTA 1980 

AAGAATACTT TGCAGATATG AAAAGACATC GTATCCAGTT CAAATATTCT GGTCCTGAAG 2040 

ATGATGCTGC TATCAGCCTG GCCTTTAGCA AAAAACAGAT AGATGATCGA AAGGAATGGT 2100 

TAACTAATTT CATGGAGGAT AGAAGACAAC GAAAGTTACT TGGGCTTCCT GAGGATTACT 2160 

TGTATGGACA AACTACCACA TATCTGACAT ATAATGACTT CATCAACAAG GAACTTATCT 2220 

TGTTCTCAAA TTCTGATAAC GAGAGATCTA TCCCTTCTAT GGTGGATGGT TTGAAACCAG 2280 

GTCAGAGAAA GGTTTTGTTT ACTTGCTTCA AACGGAATGA CAAGCGAGAA GTAAAGGTTG 2340

CCCAATTAGC TGGATCAGTG GCTGAAATGT CTTCTTATCA TCATGGTGAG ATGTCACTAA 2400 

TGATGACCAT TATCAATTTG GCTCAGAATT TTGTGGGTAG CAATAATCTA AACCTCTTGC 2460 

AGCCCATTGG TCAGTTTGGT ACCAGGCTAC ATGGTGGCAA GGATTCTGCT AGTCCACGAT 2520 

ACATCTTTAC AATGCTCAGC TCTTTGGCTC GATTGTTATT TCCACCAAAA GATGATCACA 2580 

CGTTGAAGTT TTTATATGAT GACAACCAGC GTGTTGAGCC TGAATGGTAC ATTCCTATTA 2640 

TTCCCATGGT GCTGATAAAT GGTGCTGAAG GAATCGGTAC TGGGTGGTCC TGCAAAATCC 2700 

CCAACTTTGA TGTGCGTGAA ATTGTAAATA ACATCAGGCG TTTGATGGAT GGAGAAGAAC 2760 

CTTTGCCAAT GCTTCCAAGT TACAAGAACT TCAAGGGTAC TATTGAAGAA CTGGCTCCAA 2820 

ATCAATATGT GATTAGTGGT GAAGTAGCTA TTCTTAATTC TACAACCATT GAAATCTCAG 2880 

AGCTTCCCGT CAGAACATGG ACCCAGACAT ACAAAGAACA AGTTCTAGAA CCCATGTTGA 2940 

ATGGCACCGA GAAGACACCT CCTCTCATAA CAGACTATAG GGAATACCAT ACAGATACCA 3000 

CTGTGAAATT TGTTGTGAAG ATGACTGAAG AAAAACTGGC AGAGGCAGAG AGAGTTGGAC 3060

TACACAAAGT CTTCAAACTC CAAACTAGTC TCACATGCAA CTCTATGGTG CTTTTTGACC 3120 

ACGTAGGCTG TTTAAAGAAA TATGACACGG TGTTGGATAT TCTAAGAGAC TTTTTTGAAC 3180 

TCAGACTTAA ATATTATGGA TTAAGAAAAG AATGGCTCCT AGGAATGCTT GGTGCTGAAT 3240 

CTGCTAAACT GAATAATCAG GCTCGCTTTA TCTTAGAGAA AATAGATGGC AAAATAATCA 3300 
TTGAAAATAA GCCTAAGAAA GAATTAATTA AAGTTCTGAT TCAGAGGGGA TATGATTCGG 3360

ATCCTGTGAA GGCCTGGAAA GAAGCCCAGC AAAAGGTTCC AGATGAAGAA GAAAATGAAG 3420

AGAGTGACAA CGAAAAGGAA ACTGAAAAGA GTGACTCCGT AACAGATTCT GGACCAACCT 3480

TCAACTATCT TCTTGATATG CCCCTTTGGT ATTTAACCAA GGAAAAGAAA GATGAACTCT 3540

GCAGGCTAAG AAATGAAAAA GAACAAGAGC TGGACACATT AAAAAGAAAG AGTCCATCAG 3600

ATTTGTGGAA AGAAGACTTG GCTACATTTA TTGAAGAATT GGAGGCTGTT GAAGCCAAGG 3660 

AAAAACAAGA TGAACAAGTC GGACTTCCTG GGAAAGGGGG GAAGGCCAAG GGGAAAAAAA 3720 

CACAAATGGC TGAAGTTTTG CCTTCTCCGC GTGGTCAAAG AGTCATTCCA CGAATAACCA 3780

TAGAAATGAA AGCAGAGGCA GAAAAGAAAA ATAAAAAGAA AATTAAGAAT GAAAATACTG 3840 

AAGGAAGCCC TCAAGAAGAT GGTGTGGAAC TAGAAGGCCT AAAACAAAGA TTAGAAAAGA 3900 

AACAGAAAAG AGAACCAGGT ACAAAGACAA AGAAACAAAC TACATTGGCA TTTAAGCCAA 3960 

TCAAAAAAGG AAAGAAGAGA AATCCCTGGC CTGATTCAGA ATCAGATAGG AGCAGTGACG 4020

AAAGTAATTT TGATGTCCCT CCACGAGAAA CAGAGCCACG GAGAGCAGCA ACAAAAACAA 4080

AATTCACAAT GGATTTGGAT TCAGATGAAG ATTTCTCAGA TTTTGATGAA AAAACTGATG 4140 

ATGAAGATTT TGTCCCATCA GATGCTAGTC CACCTAAGAC CAAAACTTCC CCAAAACTTA 4200 

GTAACAAAGA ACTGAAACCA CAGAAAAGTG TCGTGTCAGA CCTTGAAGCT GATGATGTTA 4260 

AGGGCAGTGT ACCACTGTCT TCAAGCCCTC CTGCTACACA TTTCCCAGAT GAAACTGAAA 4320 

TTACAAACCC AGTTCCTAAA AAGAATGTGA CAGTGAAGAA GACAGCAGCA AAAAGTCAGT 4380 

CTTCCACCTC CACTACCGGT GCCAAAAAAA GGGCTGCCCC AAAAGGAACT AAAAGGGATC 4440 

CAGCTTTGAA TTCTGGTGTC TCTCAAAAGC CTGATCCTGC CAAAACCAAG AATCGCCGCA 4500 

AAAGGAAGCC ATCCACTTCT GATGATTCTG ACTCTAATTT TGAGAAAATT GTTTCGAAAG 4560 

CAGTCACAAG CAAGAAATCC AAGGGGGAGA GTGATGACTT CCATATGGAC TTTGACTCAG 4620 

CTGTGGCTCC TCGGGCAAAA TCTGTACGGG CAAAGAAACC TATAAAGTAC CTGGAAGAGT 4680 

CAGATGAAGA TGATCTGTTT TAAAATGTGA GGCGATTATT TTAAGTAATT ATCTTACCAA 4740 

GCCCAAGACT GGTTTTAAAG TTACCTGAAG CTCTTAACTT CCTCCCCTCT GAATTTAGTT 4800 

TGGGGAAGGT GTTTTTAGTA CAAGACATCA AAGTGAAGTA AAGCCCAAGT GTTCTTTAGC 4860 

TTTTTATAAT ACTGTCTAAA TAGTGACCAT CTCATGGGCA TTGTTTTCTT CTCTGCTTTG 4920 

TCTGTGTTTT GAGTCTGCTT TCTTTTGTCT TTAAAACGTG ATTTTTAAGT TCTTCTGAAC 4980

TGTAGAAATA GCTATCTGAT CACTTCAGCG TAAAGCAGTG TGTTTATTAA CCATCCACTA 5040 

AGCTAAAACT AGAGCAGTTT GATTTAAAAG TGTCACTCTT CCTCCTTTTC TACTTTCAGT 5100 

AGATATGAGA TAGAGCATAA TTATCTGTTT TATCTTAGTT TTATACATAA TTTACCATCA 5160 

GATAGAACTT TATGGTTCTA GTACAGATAC TCTACTACAC TCAGCCTCTT ATGTGCCAAG 5220 

TTTTTCTTTA AGCAATGAGA AATTGCTCAT GTTCTTCATC TTCTCAAATC ATCAGAGGCC 5280 

AAAGAAAAAC ACTTTGGCTG TGTCTATAAC TTGACACAGT CAATAGAATG AAGAAAATTA 5340 

GAGTAGTTAT GTGATTATTT CAGCTCTTGA CCTGTCCCCT CTGGCTGCCT CTGAGTCTGA 5400 

ATCTCCCAAA GAGAGAAACC AATTTCTAAG AGGACTGGAT TGCAGAAGAC TCGGGGACAA 5460 

CATTTGATCC AAGATCTTAA ATGTTATATT GATAACCATG CTCAGCAATG AGCTATTAGA 5520 

TTCATTTTGG GAAATCTCCA TAATTTCAAT TTGTAAACTT TGTTAAGACC TGTCTACATT 5580 

GTTATATGTG TGTGACTTGA GTAATGTTAT CAACGTTTTT GTAAATATTT ACTATGTTTT 5640 
TCTATTAGCT AAATTCCAAC AATTTTGTAC TTTAATAAAA TGTTCTAAAC ATTGC 

Seq ID NO: 178 Protein sequence: 
1 11 21 31 41 51
MEVSPLQPVN ENMQVNKIKK NEDAKKRLSV ERIYQKKTQL EHILLRPDTY IGSVELVTQQ 60
MWVYDEDVGI NYREVTFVPG LYKIFDEILV NAADNKQRDP KMSCIRVTID PENNLISIWN 120 
NGKGIPVVEH KVEKMYVPAL IFGQLLTSSN YDDDEKKVTG GRNGYGAKLC NIFSTKFTVE 180
TASREYKKMF KQTWMDNMGR AGEMELKPFN GEDYTCITFQ PDLSKFKMQS LDKDIVALMV 240
RRAYDIAGST KDVKVFLNGN KLPVKGFRSY VDMYLKDKLD ETGNSLKVIH EQVNHRWEVC 300
LTMSEKGFQQ ISFVNSIATS KGGRHVDYVA DQIVTKLVDV VKKKNKGGVA VKAHQVKNHM 360
WIFVNALIEN PTFDSQTKEN MTLQPKSFGS TCQLSEKFIK AAIGCGIVES ILNWVKFKAQ 420
VQLNKKCSAV KHNRIKGIPK LDDANDAGGR NSTECTLILT EGDSAKTLAV SGLGVVGRDK 480
YGVFPLRGKI LNVREASHKQ IMENAEINNI IKIVGLQYKK NYEDEDSLKT LRYGKIMIMT 540
DQDQDGSHIK GLLINFIHHN WPSLLRHRFL EEFITPIVKV SKNKQEMAFY SLPEFEEWKS 600
STPNHKKWKV KYYKGLGTST SKEAKEYFAD MKRHRIQFKY SGPEDDAAIS LAFSKKQIDD 660
RKEWLTNFME DRRQRKLLGL PEDYLYGQTT TYLTYNDFIN KELILFSNSD NERSIPSMVD 720
GLKPGQRKVL FTCFKRNDKR EVKVAQLAGS VAEMSSYHHG EMSLMMTIIN LAQNFVGSNN 780
LNLLQPIGQF GTRLHGGKDS ASPRYIFTML SSLARLLFPP KDDHTLKFLY DDNQRVEPEW 840
YIPIIPMVLI NGAEGIGTGW SCKIPNFDVR EIVNNIRRLM DGEEPLPMLP SYKNFKGTIE 900
ELAPNQYVIS GEVAILNSTT IEISELPVRT WTQTYKEQVL EPMLNGTEKT PPLITDYREY 960

HTDTTVKFVV KMTEEKLAEA ERVGLHKVFK LQTSLTCNSM VLFDHVGCLK KYDTVLDILR 1020

DFFELRLKYY GLRKEWLLGM LGAESAKLNN QARFILEKID GKIIIENKPK KELIKVLIQR 1080

GYDSDPVKAW KEAQQKVPDE EENEESDNEK ETEKSDSVTD SGPTFNYLLD MPLWYLTKEK 1140

KDELCRLRNE KEQELDTLKR KSPSDLWKED LATFIEELEA VEAKEKQDEQ VGLPGKGGKA 1200

KGKKTQMAEV LPSPRGQRVI PRITIEMKAE AEKKNKKKIK NENTEGSPQE DGVELEGLKQ 1260

RLEKKQKREP GTKTKKQTTL AFKPIKKGKK RNPWPDSESD RSSDESNFDV PPRETEPRRA 1320

ATKTKFTMDL DSDEDFSDFD EKTDDEDFVP SDASPPKTKT SPKLSNKELK PQKSVVSDLE 1380

ADDVKGSVPL SSSPPATHFP DETEITNPVP KKNVTVKKTA AKSQSSTSTT GAKKRAAPKG 1440

TKRDPALNSG VSQKPDPAKT KNRRKRKPST SDDSDSNFEK IVSKAVTSKK SKGESDDFHM 1500
DFDSAVAPRA KSVRAKKPIK YLEESDEDDL F



Seq ID NO: 179 DNA sequence 
CACACATAOG CACGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 60
CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA 120 
CGGCGAGGGG CCGCAGACCG TCTGGAAATG CGAATCCTAA AGCGTTTCCT CGCTTGCATT 180 
CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 
CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGGAAAG 300 
AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 
CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA 420 
AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACCGT 480 

GTCAGCGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540 

AAATGCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 

GAGATGCAAA TCTACTGCTT TGATGCGGAC CGATTTTCAA GTTTTGAGGA AGCAGTCAAA 660 

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTG 720 

GATTTCAAAG CGATTATTGA TGGAGTCGAA AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 780 

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAGAT 900 

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960 

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080 

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140 

TGGGAAAGAC CTCGAGTCGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGTACCAG 1200 

CAGTTGGATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTCGACAT GCCTACTGAT 1380 

AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAAGAGGGAA AAGACATTGA AGAAGGCGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500 

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TCGCATAGGG 1560 

ACGAAATACA ATGAAGCCAA GACTAACCGA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 

GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

AACTTGTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

GAAAACCCAG AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100 

GTGTGGTTTC CTAGCTCTAC AGACATAACA GCACAGCCCG ATGTTGGATC AGGCAGAGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGAGATA CGTGTTGATG AATCTGAGAA GACAACCAAG 2220 

TCCTTTTCTG CAGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 

TCCAGACAAC AGGATTTGGT CTCCACGGTC AACGTGGTAT ACTCGCAGAC AACCCAACCG 2400 

GTATACAATG GTGAGACACC TCTTCAACCT TCCTACAGTA GTGAAGTCTT TCCTCTAGTC 2460 

ACCCCTTTGT TGCTTGACAA TCAGATCCTC AACACTACCC CTGCTGCTTC AAGTAGTGAT 2520 

TCGGCCTTGC ATGCTACGCC TGTATTTCCC AGTGTCGATG TGTCATTTGA ATCCATCCTG 2580 

TCTTCCTATG ATGGTGCACC TTTGCTTCCA TTTTCCTCTG CTTCCTTCAG TAGTGAATTG 2640 

TTTCGCCATC TGCATACAGT TTCTCAAATC CTTCCACAAG TTACTTCAGC TACCGAGAGT 2700 

GATAAGGTGC CCTTGCATGC TTCTCTGCCA GTGGCTGGGG GTGATTTGCT ATTAGAGCCC 2760 

AGCCTTGCTC AGTATTCTGA TGTGCTGTCC ACTACTCATG CTGCTTCAGA GACGCTGGAA 2820 

TTTGGTAGTG AATCTGGTGT TCTTTATAAA ACGCTTATGT TTTCTCAAGT TGAACCACCC 2880 

AGCAGTGATG CCATGATGCA TGCACGTTCT TCAGGGCCTG AACCTTCTTA TGCCTTGTCT 2940 

GATAATGAGG GCTCCCAACA CATCTTCACT GTTTCTTACA GTTCTGCAAT ACCTGTGCAT 3000 

GATTCTGTGG GTGTAACTTA TCAGGGTTCC TTATTTAGCG GCCCTAGCCA TATACCAATA 3060 

CCTAAGTCTT CGTTAATAAC CCCAACTGCA TCATTACTGC AGCCTACTCA TGCCCTCTCT 3120 

GGTGATGGGG AATGGTCTGG AGCCTCTTCT GATAGTGAAT TTCTTTTACC TGACACAGAT 3180 

GGGCTGACAG CCCTTAACAT TTCTTCACCT GTTTCTGTAG CTGAATTTAC ATATACAACA 3240 

TCTGTGTTTG GTGATGATAA TAAGGCGCTT TCTAAAAGTG AAATAATATA TGGAAATGAG 3300 

ACTGAACTGC AAATTCCTTC TTTCAATGAG ATGGTTTACC CTTCTGAAAG CACAGTCATG 3360 

CCCAACATGT ATGATAATGT AAATAAGTTG AATGCGTCTT TACAAGAAAC CTCTGTTTCC 3420 

ATTTCTAGCA CCAAGGGCAT GTTTCCAGGG TCCCTTGCTC ATACCACCAC TAAGGTTTTT 3480 

GATCATGAGA TTAGTCAAGT TCCAGAAAAT AACTTTTCAG TTCAACCTAC ACATACTGTC 3540 

TCTCAAGCAT CTGGTGACAC TTCGCTTAAA CCTGTGCTTA GTGCAAACTC AGAGCCAGCA 3600 

TCCTCTGACC CTGCTTCTAG TGAAATGTTA TCTCCTTCAA CTCAGCTCTT ATTTTATGAG 3660 

ACCTCAGCTT CTTTTAGTAC TGAAGTATTG CTACAACCTT CCTTTCAGGC TTCTGATGTT 3720 

GACACCTTGC TTAAAACTGT TCTTCCAGCT GTGCCCAGTG ATCCAATATT GGTTGAAACC 3780 

CCCAAAGTTG ATAAAATTAG TTCTACAATG TTGCATCTCA TTGTATCAAA TTCTGCTTCA 3840 

AGTGAAAACA TGCTGCACTC TACATCTGTA CCAGTTTTTG ATGTGTCGCC TACTTCTCAT 3900 

ATGCACTCTG CTTCACTTCA AGGTTTGACC ATTTCCTATG CAAGTGAGAA ATATGAACCA 3960 

GTTTTGTTAA AAAGTGAAAG TTCCCACCAA GTGGTACCTT CTTTGTACAG TAATGATGAG 4020 

TTGTTCCAAA CGGCCAATTT GGAGATTAAC CAGGCCCATC CCCCAAAAGG AAGGCATGTA 4080 

TTTGCTACAC CTGTTTTATC AATTGATGAA CCATTAAATA CACTAATAAA TAAGCTTATA 4140 

CATTCCGATG AAATTTTAAC CTCCACCAAA AGTTCTGTTA CTGGTAAGGT ATTTGCTGGT 4200 

ATTCCAACAG TTGCTTCTGA TACATTTGTA TCTACTGATC ATTCTGTTCC TATAGGAAAT 4260 

GGGCATGTTG CCATTACAGC TGTTTCTCCC CACAGAGATG GTTCTGTAAC CTCAACAAAG 4320 

TTGCTGTTTC CTTCTAAGGC AACTTCTGAG CTGAGTCATA GTGCCAAATC TGATGCCGGT 4380 

TTAGTGGGTG GTGGTGAAGA TGGTGACACT GATGATGATG GTGATGATGA TGATGATGAC 4440 

AGAGGTAGTG ATGGCTTATC CATTCATAAG TGTATGTCAT GCTCATCCTA TAGAGAATCA 4500 

CAGGAAAAGG TAATGAATGA TTCAGACACC CACGAAAACA GTCTTATGGA TCAGAATAAT 4560 

CCAATCTCAT ACTCACTATC TGAGAATTCT GAAGAAGATA ATAGAGTCAC AAGTGTATCC 4620 

TCAGACAGTC AAACTGGTAT GGACAGAAGT CCTGGTAAAT CACCATCAGC AAATGGGCTA 4680 

TCCCAAAAGC ACAATGATGG AAAAGAGGAA AATGACATTC AGACTGGTAG TGCTCTGCTT 4740 

CCTCTCAGCC CTGAATCTAA AGCATGGGCA GTTCTGACAA GTGATGAAGA AAGTGGATCA 4800 

GGGCAAGGTA CCTCAGATAG CCTTAATGAG AATGAGACTT CCACAGATTT CAGTTTTGCA 4860 

GACACTAATG AAAAAGATGC TGATGGGATC CTGGCAGCAG GTGACTCAGA AATAACTCCT 4920 

GGATTCCCAC AGTCCCCAAC ATCATCTGTT ACTAGCGAGA ACTCAGAAGT GTTCCACGTT 4980 

TCAGAGGCAG AGGCCAGTAA TAGTAGCCAT GAGTCTCGTA TTGGTCTAGC TGAGGGGTTG 5040 

GAATCCGAGA AGAAGGCAGT TATACCCCTT GTGATCGTGT CAGCCCTGAC TTTTATCTGT 5100 

CTAGTGGTTC TTGTGGGTAT TCTCATCTAC TGGAGGAAAT GCTTCCAGAC TGCACACTTT 5160 

TACTTAGAGG ACAGTACATC CCCTAGAGTT ATATCCACAC CTCCAACACC TATCTTTCCA 5220 

ATTTCAGATG ATGTCGGAGC AATTCCAATA AAGCACTTTC CAAAGCATGT TGCAGATTTA 5280 

CATGCAAGTA GTGGGTTTAC TGAAGAATTT GAGACACTGA AAGAGTTTTA CCAGGAAGTG 5340 

CAGAGCTGTA CTGTTGACTT AGGTATTACA GCAGACAGCT CCAACCACCC AGACAACAAG 5400 

CACAAGAATC GATACATAAA TATCGTTGCC TATGATCATA GCAGGGTTAA GCTAGCACAG 5460 

CTTGCTGAAA AGGATGGCAA ACTGACTGAT TATATCAATG CCAATTATGT TGATGGCTAC 5520 

AACAGACCAA AAGCTTATAT TGCTGCCCAA GGCCCACTGA AATCCACAGC TGAAGATTTC 5580 

TGGAGAATGA TATGGGAACA TAATGTGGAA GTTATTGTCA TGATAACAAA CCTCGTGGAG 5640 
AAAGGAAGGA GAAAATGTGA TCAGTACTGG CCTGCCGATG GGAGTGAGGA GTACGGGAAC 5700 

TTTCTGGTCA CTCAGAAGAG TGTGCAAGTG CTTGCCTATT ATACTGTGAG GAATTTTACT 5760 

CTAAGAAACA CAAAAATAAA AAAGGGCTCC CAGAAAGGAA GACCCAGTGG ACGTGTGGTC 5820 

ACACAGTATC ACTACACGCA GTGGCCTGAC ATGGGAGTAC CAGAGTACTC CCTGCCAGTG 5880 

CTGACCTTTG TGAGAAAGGC AGCCTATGCC AAGCGCCATG CAGTGGGGCC TGTTGTCGTC 5940 

CACTGCAGTG CTGGAGTTGG AAGAACAGGC ACATATATTG TGCTAGACAG TATGTTGCAG 6000 

CAGATTCAAC ACGAAGGAAC TGTCAACATA TTTGGCTTCT TAAAACACAT CCGTTCACAA 6060 

AGAAATTATT TGGTACAAAC TGAGGAGCAA TATGTCTTCA TTCATGATAC ACTGGTTGAG 6120 

GCCATACTTA GTAAAGAAAC TGAGGTGCTG GACAGTCATA TTCATGCCTA TGTTAATGCA 6180 

CTCCTCATTC CTGGACCAGC AGGCAAAACA AAGCTAGAGA AACAATTCCA GCTCCTGAGC 6240 

CAGTCAAATA TACAGCAGAG TGACTATTCT GCAGCCCTAA AGCAATGCAA CAGGGAAAAG 6300 

AATCGAACTT CTTCTATCAT CCCTGTGGAA AGATCAAGGG TTGGCATTTC ATCCCTGAGT 6360 

GGAGAAGGCA CAGACTACAT CAATGCCTCC TATATCATGG GCTATTACCA GAGCAATGAA 6420 

TTCATCATTA CCCAGCACCC TCTCCTTCAT ACCATCAAGG ATTTCTGGAG GATGATATGG 6480 

GACCATAATG CCCAACTGGT GGTTATGATT CCTGATGGCC AAAACATGGC AGAAGATGAA 6540 

TTTGTTTACT GGCCAAATAA AGATGAGCCT ATAAATTGTG AGAGCTTTAA GGTCACTCTT 6600 

ATGGCTGAAG AACACAAATG TCTATCTAAT GAGGAAAAAC TTATAATTCA GGACTTTATC 6660 

TTAGAAGCTA CACAGGATGA TTATGTACTT GAAGTGAGGC ACTTTCAGTG TCCTAAATGG 6720 

CCAAATCCAG ATAGCCCCAT TAGTAAAACT TTTGAACTTA TAAGTGTTAT AAAAGAAGAA 6780 

GCTGCCAATA GGGATGGGCC TATGATTGTT CATGATGAGC ATGGAGGAGT GACGGCAGGA 6840 

ACTTTCTGTG CTCTGACAAC CCTTATGCAC CAACTAGAAA AAGAAAATTC CGTGGATGTT 6900 

TACCAGGTAG CCAAGATGAT CAATCTGATG AGGCCAGGAG TCTTTGCTGA CATTGAGCAG 6960 

TATCAGTTTC TCTACAAAGT GATCCTCAGC CTTGTGAGCA CAAGGCAGGA AGAGAATCCA 7020 

TCCACCTCTC TGGACAGTAA TGGTGCAGCA TTGCCTGATG GAAATATAGC TGAGAGCTTA 7080 

GAGTCTTTAG TTTAACACAG AAAGGGGTGG GGGGACTCAC ATCTGAGCAT TGTTTTCCTC 7140 

TTCCTAAAAT TAGGCAGGAA AATCAGTCTA GTTCTGTTAT CTGTTGATTT CCCATCACCT 7200 

GACAGTAACT TTCATGACAT AGGATTCTGC CGCCAAATTT ATATCATTAA CAATGTGTGC 7260 

CTTTTTGCAA GACTTGTAAT TTACTTATTA TGTTTGAACT AAAATGATTG AATTTTACAG 7320 

TATTTCTAAG AATGGAATTG TGGTATTTTT TTCTGTATTG ATTTTAACAG AAAATTTCAA 7380 

TTTATAGAGG TTAGGAATTC CAAACTACAG AAAATGTTTG TTTTTAGTGT CAAATTTTTA 7440 

GCTGTATTTG TAGCAATTAT CAGGTTTGCT AGAAATATAA CTTTTAATAC AGTAGCCTGT 7500 

AAATAAAACA CTCTTCCATA TGATATTCAA CATTTTACAA CTGCAGTATT CACCTAAAGT 7560 

AGAAATAATC TGTTACTTAT TGTAAATACT GCCCTAGTGT CTCCATGGAC CAAATTTATA 7620 

TTTATAATTG TAGATTTTTA TATTTTACTA CTGAGTCAAG TTTTCTAGTT CTGTGTAATT 7680 

GTTTAGTTTA ATGACGTAGT TCATTAGCTG GTCTTACTCT ACCAGTTTTC TGACATTGTA 7740 

TTGTGTTACC TAAGTCATTA ACTTTGTTTC AGCATGTAAT TTTAACTTTT GTGGAAAATA 7800 

GAAATACCTT CATTTTGAAA GAAGTTTTTA TGAGAATAAC ACCTTACCAA ACATTGTTCA 7860 

AATGGTTTTT ATCCAAGGAA TTGCAAAAAT AAATATAAAT ATTGCCATTA AAAAAAAAAA 7920 
AAAAAAAAAA AAAAAAAAAA AAAA 

Seq ID NO: 180 Protein sequence:

MRILKRFLAC IQLLCVCRLD WANGYYRQQR KLVEEIGWSY TGALNQKNWG KKYPTCNSPK 60 

QSPINIDEDL TQVNVNLKKL KFQGWDKTSL ENTFIHNTGK TVEINLTNDY RVSGGVSEMV 120 

FKASKITFHW GKCNMSSDGS EHSLEGQKFP LEMQIYCFDA DRFSSFEEAV KGKGKLRALS 180 

ILFEVGTEEN LDFKAIIDGV ESVSRFGKQA ALDPFILLNL LPNSTDKYYI YNGSLTSPPC 240 

TDTVDWIVFK DTVSISESQL AVFCEVLTMQ QSGYVMLMDY LQNNFREQQY KFSRQVFSSY 300 

TGKEEIHEAV CSSEPENVQA DPENYTSLLV TWERPRVVYD TMIEKFAVLY QQLDGEDQTK 360 

HEFLTDGYQD LGAILNNLLP NMSYVLQIVA ICTNGLYGKY SDQLIVDMPT DNPELDLFPE 420 

LIGTEEIIKE EEEGKDIEEG AIVNPGRDSA TNQIRKKEPQ ISTTTHYNRI GTKYNEAKTN 480 

RSPTRGSEFS GKGDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTELPPHT VEGTSASLND 540 

GSKTVLRSPH MNLSGTAESL NTVSITEYEE ESLLTSFKLD TGAEDSSGSS PATSAIPFIS 600 

ENISQGYIFS SENPETITYD VLIPESARNA SEDSTSSGSE ESLKDPSMEG NVWFPSSTDI 660 

TAQPDVGSGR ESFLQTNYTE IRVDESEKTT KSFSAGPVMS QGPSVTDLEM PHYSTFAYFP 720 

TEVTPHAFTP SSRQQDLVST VNVVYSQTTQ PVYNGETPLQ PSYSSEVFPL VTPLLLDNQI 780 

LNTTPAASSS DSALHATPVF PSVDVSFESI LSSYDGAPLL PFSSASFSSE LFRHLHTVSQ 840 

ILPQVTSATE SDKVPLHASL PVAGGDLLLE PSLAQYSDVL STTHAASETL EFGSESGVLY 900 

KTLMFSQVEP PSSDAMMHAR SSGPEPSYAL SDNEGSQHIF TVSYSSAIPV HDSVGVTYQG 960 

SLFSGPSHIP IPKSSLITPT ASLLQPTHAL SGDGEWSGAS SDSEFLLPDT DGLTALNISS 1020 

PVSVAEFTYT TSVFGDDNKA LSKSEIIYGN ETELQIPSFN EMVYPSESTV MPNMYDNVNK 1080 

LNASLQETSV SISSTKGMFP GSLAHTTTKV FDHEISQVPE NNFSVQPTHT VSQASGDTSL 1140 

KPVLSANSEP ASSDPASSEM LSPSTQLLFY ETSASFSTEV LLQPSFQASD VDTLLKTVLP 1200 

AVPSDPILVE TPKVDKISST MLHLIVSNSA SSENMLHSTS VPVFDVSPTS HMHSASLQGL 1260 

TISYASEKYE PVLLKSESSH QVVPSLYSND ELFQTANLEI NQAHPPKGRH VFATPVLSID 1320 

EPLNTLINKL IHSDEILTST KSSVTGKVFA GIPTVASDTF VSTDHSVPIG NGHVAITAVS 1380 

PHRDGSVTST KLLFPSKATS ELSHSAKSDA GLVGGGEDGD TDDDGDDDDD DRGSDGLSIH 1440 

KCMSCSSYRE SQEKVMNDSD THENSLMDQN NPISYSLSEN SEEDNRVTSV SSDSQTGMDR 1500 

SPGKSPSANG LSQKHNDGKE ENDIQTGSAL LPLSPESKAW AVLTSDEESG SGQGTSDSLN 1560 

ENETSTDFSF ADTNEKDADG ILAAGDSEIT PGFPQSPTSS VTSENSEVFH VSEAEASNSS 1620 

HESRIGLAEG LESEKKAVIP LVIVSALTFI CLVVLVGILI YWRKCFQTAH FYLEDSTSPR 1680 

VISTPPTPIF PISDDVGAIP IKHFPKHVAD LHASSGFTEE FETLKEFYQE VQSCTVDLGI 1740 

TADSSNHPDN KHKNRYINIV AYDHSRVKLA QLAEKDGKLT DYINANYVDG YNRPKAYIAA 1800 

QGPLKSTAED FWRMIWEHNV EVIVMITNLV EKGRRKCDQY WPADGSEEYG NFLVTQKSVQ 1860 

VLAYYTVRNF TLRNTKIKKG SQKGRPSGRV VTQYHYTQWP DMGVPEYSLP VLTFVRKAAY 1920 

AKRHAVGPVV VHCSAGVGRT GTYIVLDSML QQIQHEGTVN IFGFLKHIRS QRNYLVQTEE 1980 

QYVFIHDTLV EAILSKETEV LDSHIHAYVN ALLIPGPAGK TKLEKQFQLL SQSNIQQSDY 2040 

SAALKQCNRE KNRTSSIIPV ERSRVGISSL SGEGTDYINA SYIMGYYQSN EFIITQHPLL 2100 

HTIKDFWRMI WDHNAQLVVM IPDGQNMAED EFVYWPNKDE PINCESFKVT LMAEEHKCLS 2160 

NEEKLIIQDF ILEATQDDYV LEVRHFQCPK WPNPDSPISK TFELISVIKE EAANRDGPMI 2220 

VHDEHGGVTA GTFCALTTLM HQLEKENSVD VYQVAKMINL MRPGVFADIE QYQFLYKVIL 2280 
SLVSTRQEEN PSTSLDSNGA ALPDGNIAES LESLV 

Seq ID NO: 181 DNA sequence 

Nucleic Acid Accession #: Eos sequence 
Coding sequence: 148-4518 



1 11 21 31 41 51 

| | | | | | 

CACACATACG CACGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA 120 

CGGCGAGGGG CCGCAGACCG TCTGGAAATG CGAATCCTAA AGCGTTTCCT CGCTTGCATT 180 

CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGGAAAG 300 

AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 

CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACCGT 480 

GTCAGCGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540 

AAATGCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 

GAGATGCAAA TCTACTGCTT TGATGCGGAC CGATTTTCAA GTTTTGAGGA AGCAGTCAAA 660 

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTG 720 

GATTTCAAAG CGATTATTGA TGGAGTCGAA AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 780 

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAGAT 900 

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960 

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080 

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140 

TGGGAAAGAC CTCGAGTCGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGTACCAG 1200 

CAGTTGGATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTCGACAT GCCTACTGAT 1380 

AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAAGAGGGAA AAGACATTGA AGAAGGCGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500 

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TCGCATAGGG 1560 

ACGAAATACA ATGAAGCCAA GACTAACCGA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 
GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

AACTTGTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

GAAAACCCAG AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100 

GTGTGGTTTC CTAGCTCTAC AGACATAACA GCACAGCCCG ATGTTGGATC AGGCAGAGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGAGATA CGTGTTGATG AATCTGAGAA GACAACCAAG 2220 

TCCTTTTCTG CAGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 

TCCAGACAAC AGGATTTGGT CTCCACGGTC AACGTGGTAT ACTCGCAGAC AACCCAACCG 2400 

GTATACAATG CAGAGGCCAG TAATAGTAGC CATGAGTCTC GTATTGGTCT AGCTGAGGGG 2460 

TTGGAATCCG AGAAGAAGGC AGTTATACCC CTTGTGATCG TGTCAGCCCT GACTTTTATC 2520 

TGTCTAGTGG TTCTTGTGGG TATTCTCATC TACTGGAGGA AATGCTTCCA GACTGCACAC 2580 

TTTTACTTAG AGGACAGTAC ATCCCCTAGA GTTATATCCA CACCTCCAAC ACCTATCTTT 2640 

CCAATTTCAG ATGATGTCGG AGCAATTCCA ATAAAGCACT TTCCAAAGCA TGTTGCAGAT 2700 

TTACATGCAA GTAGTGGGTT TACTGAAGAA TTTGAGACAC TGAAAGAGTT TTACCAGGAA 2760 

GTGCAGAGCT GTACTGTTGA CTTAGGTATT ACAGCAGACA GCTCCAACCA CCCAGACAAC 2820 

AAGCACAAGA ATCGATACAT AAATATCGTT GCCTATGATC ATAGCAGGGT TAAGCTAGCA 2880 

CAGCTTGCTG AAAAGGATGG CAAACTGACT GATTATATCA ATGCCAATTA TGTTGATGGC 2940 

TACAACAGAC CAAAAGCTTA TATTGCTGCC CAAGGCCCAC TGAAATCCAC AGCTGAAGAT 3000 

TTCTGGAGAA TGATATGGGA ACATAATGTG GAAGTTATTG TCATGATAAC AAACCTCGTG 3060 

GAGAAAGGAA GGAGAAAATG TGATCAGTAC TGGCCTGCCG ATGGGAGTGA GGAGTACGGG 3120 

AACTTTCTGG TCACTCAGAA GAGTGTGCAA GTGCTTGCCT ATTATACTGT GAGGAATTTT 3180 

ACTCTAAGAA ACACAAAAAT AAAAAAGGGC TCCCAGAAAG GAAGACCCAG TGGACGTGTG 3240 

GTCACACAGT ATCACTACAC GCAGTGGCCT GACATGGGAG TACCAGAGTA CTCCCTGCCA 3300 

GTGCTGACCT TTGTGAGAAA GGCAGCCTAT GCCAAGCGCC ATGCAGTGGG GCCTGTTGTC 3360 

GTCCACTGCA GTGCTGGAGT TGGAAGAACA GGCACATATA TTGTGCTAGA CAGTATGTTG 3420 

CAGCAGATTC AACACGAAGG AACTGTCAAC ATATTTGGCT TCTTAAAACA CATCCGTTCA 3480 

CAAAGAAATT ATTTGGTACA AACTGAGGAG CAATATGTCT TCATTCATGA TACACTGGTT 3540 

GAGGCCATAC TTAGTAAAGA AACTGAGGTG CTGGACAGTC ATATTCATGC CTATGTTAAT 3600 

GCACTCCTCA TTCCTGGACC AGCAGGCAAA ACAAAGCTAG AGAAACAATT CCAGCTCCTG 3660 

AGCCAGTCAA ATATACAGCA GAGTGACTAT TCTGCAGCCC TAAAGCAATG CAACAGGGAA 3720 

AAGAATCGAA CTTCTTCTAT CATCCCTGTG GAAAGATCAA GGGTTGGCAT TTCATCCCTG 3780 

AGTGGAGAAG GCACAGACTA CATCAATGCC TCCTATATCA TGGGCTATTA CCAGAGCAAT 3840 

GAATTCATCA TTACCCAGCA CCCTCTCCTT CATACCATCA AGGATTTCTG GAGGATGATA 3900 

TGGGACCATA ATGCCCAACT GGTGGTTATG ATTCCTGATG GCCAAAACAT GGCAGAAGAT 3960 

GAATTTGTTT ACTGGCCAAA TAAAGATGAG CCTATAAATT GTGAGAGCTT TAAGGTCACT 4020 

CTTATGGCTG AAGAACACAA ATGTCTATCT AATGAGGAAA AACTTATAAT TCAGGACTTT 4080 

ATCTTAGAAG CTACACAGGA TGATTATGTA CTTGAAGTGA GGCACTTTCA GTGTCCTAAA 4140 

TGGCCAAATC CAGATAGCCC CATTAGTAAA ACTTTTGAAC TTATAAGTGT TATAAAAGAA 4200 

GAAGCTGCCA ATAGGGATGG GCCTATGATT GTTCATGATG AGCATGGAGG AGTGACGGCA 4260 

GGAACTTTCT GTGCTCTGAC AACCCTTATG CACCAACTAG AAAAAGAAAA TTCCGTGGAT 4320 

GTTTACCAGG TAGCCAAGAT GATCAATCTG ATGAGGCCAG GAGTCTTTGC TGACATTGAG 4380 

CAGTATCAGT TTCTCTACAA AGTGATCCTC AGCCTTGTGA GCACAAGGCA GGAAGAGAAT 4440 

CCATCCACCT CTCTGGACAG TAATGGTGCA GCATTGCCTG ATGGAAATAT AGCTGAGAGC 4500 

TTAGAGTCTT TAGTTTAACA CAGAAAGGGG TGGGGGGACT CACATCTGAG CATTGTTTTC 4560 

CTCTTCCTAA AATTAGGCAG GAAAATCAGT CTAGTTCTGT TATCTGTTGA TTTCCCATCA 4620 

CCTGACAGTA ACTTTCATGA CATAGGATTC TGCCGCCAAA TTTATATCAT TAACAATGTG 4680 

TGCCTTTTTG CAAGACTTGT AATTTACTTA TTATGTTTGA ACTAAAATGA TTGAATTTTA 4740 

CAGTATTTCT AAGAATGGAA TTGTGGTATT TTTTTCTGTA TTGATTTTAA CAGAAAATTT 4800 

CAATTTATAG AGGTTAGGAA TTCCAAACTA CAGAAAATGT TTGTTTTTAG TGTCAAATTT 4860 

TTAGCTGTAT TTGTAGCAAT TATCAGGTTT GCTAGAAATA TAACTTTTAA TACAGTAGCC 4920 

TGTAAATAAA ACACTCTTCC ATATGATATT CAACATTTTA CAACTGCAGT ATTCACCTAA 4980 
AGTAGAAATA ATCTGTTACT TATTGTAAAT ACTGCCCTAG TGTCTCCATG GACCAAATTT 5040 

ATATTTATAA TTGTAGATTT TTATATTTTA CTACTGAGTC AAGTTTTCTA GTTCTGTGTA 5100 

ATTGTTTAGT TTAATGACGT AGTTCATTAG CTGGTCTTAC TCTACCAGTT TTCTGACATT 5160 

GTATTGTGTT ACCTAAGTCA TTAACTTTGT TTCAGCATGT AATTTTAACT TTTGTGGAAA 5220 

ATAGAAATAC CTTCATTTTG AAAGAAGTTT TTATGAGAAT AACACCTTAC CAAACATTGT 5280 

TCAAATGGTT TTTATCCAAG GAATTGCAAA AATAAATATA AATATTGCCA TTAAAAAAAA 5340 
AAAAAAAAAA AAAAAAAAAA AAAAAAA 

Seq ID NO: 182 Protein sequence: 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

| | | | | | 

MRILKRFLAC IQLLCVCRLD WANGYYRQQR KLVEEIGWSY TGALNQKNWG KKYPTCNSPK 60 

QSPINIDEDL TQVNVNLKKL KFQGWDKTSL ENTFIHNTGK TVEINLTNDY RVSGGVSEMV 120 

FKASKITFHW GKCNMSSDGS EHSLEGQKFP LEMQIYCFDA DRFSSFEEAV KGKGKLRALS 180 

ILFEVGTEEN LDFKAIIDGV ESVSRFGKQA ALDPFILLNL LPNSTDKYYI YNGSLTSPPC 240 

TDTVDWIVFK DTVSISESQL AVFCEVLTMQ QSGYVMLMDY LQNNFREQQY KFSRQVFSSY 300 

TGKEEIHEAV CSSEPENVQA DPENYTSLLV TWERPRVVYD TMIEKFAVLY QQLDGEDQTK 360 

HEFLTDGYQD LGAILNNLLP NMSYVLQIVA ICTNGLYGKY SDQLIVDMPT DNPELDLFPE 420 

LIGTEEIIKE EEEGKDIEEG AIVNPGRDSA TNQIRKKEPQ ISTTTHYNRI GTKYNEAKTN 480 

RSPTRGSEFS GKGDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTELPPHT VEGTSASLND 540 

GSKTVLRSPH MNLSGTAESL NTVSITEYEE ESLLTSFKLD TGAEDSSGSS PATSAIPFIS 600 

ENISQGYIFS SENPETITYD VLIPESARNA SEDSTSSGSE ESLKDPSMEG NVWFPSSTDI 660 

TAQPDVGSGR ESFLQTNYTE IRVDESEKTT KSFSAGPVMS QGPSVTDLEM PHYSTFAYFP 720 

TEVTPHAFTP SSRQQDLVST VNVVYSQTTQ PVYNAEASNS SHESRIGLAE GLESEKKAVI 780 

PLVIVSALTF ICLVVLVGIL IYWRKCFQTA HFYLEDSTSP RVISTPPTPI FPISDDVGAI 840 

PIKHFPKHVA DLHASSGFTE EFETLKEFYQ EVQSCTVDLG ITADSSNHPD NKHKNRYINI 900 

VAYDHSRVKL AQLAEKDGKL TDYINANYVD GYNRPKAYIA AQGPLKSTAE DFWRMIWEHN 960 

VEVIVMITNL VEKGRRKCDQ YWPADGSEEY GNFLVTQKSV QVLAYYTVRN FTLRNTKIKK 1020 

GSQKGRPSGR VVTQYHYTQW PDMGVPEYSL PVLTFVRKAA YAKRHAVGPV VVHCSAGVGR 1080 

TGTYIVLDSM LQQIQHEGTV NIFGFLKHIR SQRNYLVQTE EQYVFIHDTL VEAILSKETE 1140 

VLDSHIHAYV NALLIPGPAG KTKLEKQFQL LSQSNIQQSD YSAALKQCNR EKNRTSSIIP 1200 

VERSRVGISS LSGEGTDYIN ASYIMGYYQS NEFIITQHPL LHTIKDFWRM IWDHNAQLVV 1260 

MIPDGQNMAE DEFVYWPNKD EPINCESFKV TLMAEEHKCL SNEEKLIIQD FILEATQDDY 1320 

VLEVRHFQCP KWPNPDSPIS KTFELISVIK EEAANRDGPM IVHDEHGGVT AGTFCALTTL 1380 

MHQLEKENSV DVYQVAKMIN LMRPGVFADI EQYQFLYKVI LSLVSTRQEE NPSTSLDSNG 1440 
AALPDGNIAE SLESLV 

Seq ID NO: 183 DNA sequence 

Nucleic Acid Accession #: EOS sequence 

Coding sequence: 148-4494 

1 11 21 31 41 51 

| | | | | | 

CACACATACG CACGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA 120 

CGGCGAGGGG CCGCAGACCG TCTGGAAATG CGAATCCTAA AGCGTTTCCT CGCTTGCATT 180 

CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGGAAAG 300 

AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 

CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACCGT 480 

GTCAGCGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540 

AAATGCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 

GAGATGCAAA TCTACTGCTT TGATGCAGAC CGATTTTCAA GTTTTGAGGA AGCAGTCAAA 660 

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTG 720 

GATTTCAAAG CGATTATTGA TGGAGTCGAA AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 780 

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAGAT 900 

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960 

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080 

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140 

TGGGAAAGAC CTCGAGTCGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGTACCAG 1200 

CAGTTGGATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTCGACAT GCCTACTGAT 1380 

AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAAGAGGGAA AAGACATTGA AGAAGGCGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500 

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TCGCATAGGG 1560 

ACGAAATACA ATGAAGCCAA GACTAACCGA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 

GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

AACTTGTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

GAAAACCCAG AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100 

GTGTGGTTTC CTAGCTCTAC AGACATAACA GCACAGCCCG ATGTTGGATC AGGCAGAGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGAGATA CGTGTTGATG AATCTGAGAA GACAACCAAG 2220 

TCCTTTTCTG CAGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 

TCCAGACAAC AGGATTTGGT CTCCACGGTC AACGTGGTAT ACTCGCAGAC AACCCAACCG 2400 

GTATACAATG AGGCCAGTAA TAGTAGCCAT GAGTCTCGTA TTGGTCTAGC TGAGGGGTTG 2460 
GAATCCGAGA AGAAGGCAGT TATACCCCTT GTGATCGTGT CAGCCCTGAC TTTTATCTGT 2520 

CTAGTGGTTC TTGTGGGTAT TCTCATCTAC TGGAGGAAAT GCTTCCAGAC TGCACACTTT 2580 

TACTTAGAGG ACAGTACATC CCCTAGAGTT ATATCCACAC CTCCAACACC TATCTTTCCA 2640 

ATTTCAGATG ATGTCGGAGC AATTCCAATA AAGCACTTTC CAAAGCATGT TGCAGATTTA 2700 

CATGCAAGTA GTGGGTTTAC TGAAGAATTT GAGGAAGTGC AGAGCTGTAC TGTTGACTTA 2760 

GGTATTACAG CAGACAGCTC CAACCACCCA GACAACAAGC ACAAGAATCG ATACATAAAT 2820 

ATCGTTGCCT ATGATCATAG CAGGGTTAAG CTAGCACAGC TTGCTGAAAA GGATGGCAAA 2880 

CTGACTGATT ATATCAATGC CAATTATGTT GATGGCTACA ACAGACCAAA AGCTTATATT 2940 

GCTGCCCAAG GCCCACTGAA ATCCACAGCT GAAGATTTCT GGAGAATGAT ATGGGAACAT 3000 

AATGTGGAAG TTATTGTCAT GATAACAAAC CTCGTGGAGA AAGGAAGGAG AAAATGTGAT 3060 

CAGTACTGGC CTGCCGATGG GAGTGAGGAG TACGGGAACT TTCTGGTCAC TCAGAAGAGT 3120 

GTGCAAGTGC TTGCCTATTA TACTGTGAGG AATTTTACTC TAAGAAACAC AAAAATAAAA 3180 

AAGGGCTCCC AGAAAGGAAG ACCCAGTGGA CGTGTGGTCA CACAGTATCA CTACACGCAG 3240 

TGGCCTGACA TGGGAGTACC AGAGTACTCC CTGCCAGTGC TGACCTTTGT GAGAAAGGCA 3300 

GCCTATGCCA AGCGCCATGC AGTGGGGCCT GTTGTCGTCC ACTGCAGTGC TGGAGTTGGA 3360 

AGAACAGGCA CATATATTGT GCTAGACAGT ATGTTGCAGC AGATTCAACA CGAAGGAACT 3420 

GTCAACATAT TTGGCTTCTT AAAACACATC CGTTCACAAA GAAATTATTT GGTACAAACT 3480 

GAGGAGCAAT ATGTCTTCAT TCATGATACA CTGGTTGAGG CCATACTTAG TAAAGAAACT 3540 

GAGGTGCTGG ACAGTCATAT TCATGCCTAT GTTAATGCAC TCCTCATTCC TGGACCAGCA 3600 

GGCAAAACAA AGCTAGAGAA ACAATTCCAG CTCCTGAGCC AGTCAAATAT ACAGCAGAGT 3660 

GACTATTCTG CAGCCCTAAA GCAATGCAAC AGGGAAAAGA ATCGAACTTC TTCTATCATC 3720 

CCTGTGGAAA GATCAAGGGT TGGCATTTCA TCCCTGAGTG GAGAAGGCAC AGACTACATC 3780 

AATGCCTCCT ATATCATGGG CTATTACCAG AGCAATGAAT TCATCATTAC CCAGCACCCT 3840 

CTCCTTCATA CCATCAAGGA TTTCTGGAGG ATGATATGGG ACCATAATGC CCAACTGGTG 3900 

GTTATGATTC CTGATGGCCA AAACATGGCA GAAGATGAAT TTGTTTACTG GCCAAATAAA 3960 

GATGAGCCTA TAAATTGTGA GAGCTTTAAG GTCACTCTTA TGGCTGAAGA ACACAAATGT 4020 

CTATCTAATG AGGAAAAACT TATAATTCAG GACTTTATCT TAGAAGCTAC ACAGGATGAT 4080 

TATGTACTTG AAGTGAGGCA CTTTCAGTGT CCTAAATGGC CAAATCCAGA TAGCCCCATT 4140 

AGTAAAACTT TTGAACTTAT AAGTGTTATA AAAGAAGAAG CTGCCAATAG GGATGGGCCT 4200 

ATGATTGTTC ATGATGAGCA TGGAGGAGTG ACGGCAGGAA CTTTCTGTGC TCTGACAACC 4260 

CTTATGCACC AACTAGAAAA AGAAAATTCC GTGGATGTTT ACCAGGTAGC CAAGATGATC 4320 

AATCTGATGA GGCCAGGAGT CTTTGCTGAC ATTGAGCAGT ATCAGTTTCT CTACAAAGTG 4380 

ATCCTCAGCC TTGTGAGCAC AAGGCAGGAA GAGAATCCAT CCACCTCTCT GGACAGTAAT 4440 

GGTGCAGCAT TGCCTGATGG AAATATAGCT GAGAGCTTAG AGTCTTTAGT TTAACACAGA 4500 

AAGGGGTGGG GGGACTCACA TCTGAGCATT GTTTTCCTCT TCCTAAAATT AGGCAGGAAA 4560 

ATCAGTCTAG TTCTGTTATC TGTTGATTTC CCATCACCTG ACAGTAACTT TCATGACATA 4620 

GGATTCTGCC GCCAAATTTA TATCATTAAC AATGTGTGCC TTTTTGCAAG ACTTGTAATT 4680 

TACTTATTAT GTTTGAACTA AAATGATTGA ATTTTACAGT ATTTCTAAGA ATGGAATTGT 4740 

GGTATTTTTT TCTGTATTGA TTTTAACAGA AAATTTCAAT TTATAGAGGT TAGGAATTCC 4800 

AAACTACAGA AAATGTTTGT TTTTAGTGTC AAATTTTTAG CTGTATTTGT AGCAATTATC 4860 

AGGTTTGCTA GAAATATAAC TTTTAATACA GTAGCCTGTA AATAAAACAC TCTTCCATAT 4920 

GATATTCAAC ATTTTACAAC TGCAGTATTC ACCTAAAGTA GAAATAATCT GTTACTTATT 4980 

GTAAATACTG CCCTAGTGTC TCCATGGACC AAATTTATAT TTATAATTGT AGATTTTTAT 5040 

ATTTTACTAC TGAGTCAAGT TTTCTAGTTC TGTGTAATTG TTTAGTTTAA TGACGTAGTT 5100 

CATTAGCTGG TCTTACTCTA CCAGTTTTCT GACATTGTAT TGTGTTACCT AAGTCATTAA 5160 

CTTTGTTTCA GCATGTAATT TTAACTTTTG TGGAAAATAG AAATACCTTC ATTTTGAAAG 5220 

AAGTTTTTAT GAGAATAACA CCTTACCAAA CATTGTTCAA ATGGTTTTTA TCCAAGGAAT 5280 

TGCAAAAATA AATATAAATA TTGCCATTAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 5340 
AAA 

Seq ID NO: 184 Protein sequence: 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

| | | | | | 

MRILKRFLAC IQLLCVCRLD WANGYYRQQR KLVEEIGWSY TGALNQKNWG KKYPTCNSPK 60 

QSPINIDEDL TQVNVNLKKL KFQGWDKTSL ENTFIHNTGK TVEINLTNDY RVSGGVSEMV 120 

FKASKITFHW GKCNMSSDGS EHSLEGQKFP LEMQIYCFDA DRFSSFEEAV KGKGKLRALS 180 

ILFEVGTEEN LDFKAIIDGV ESVSRFGKQA ALDPFILLNL LPNSTDKYYI YNGSLTSPPC 240 

TDTVDWIVFK DTVSISESQL AVFCEVLTMQ QSGYVMLMDY LQNNFREQQY KFSRQVFSSY 300 

TGKEEIHEAV CSSEPENVQA DPENYTSLLV TWERPRVVYD TMIEKFAVLY QQLDGEDQTK 360 

HEFLTDGYQD LGAILNNLLP NMSYVLQIVA ICTNGLYGKY SDQLIVDMPT DNPELDLFPE 420 

LIGTEEIIKE EEEGKDIEEG AIVNPGRDSA TNQIRKKEPQ ISTTTHYNRI GTKYNEAKTN 480 

RSPTRGSEFS GKGDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTELPPHT VEGTSASLND 540 

GSKTVLRSPH MNLSGTAESL NTVSITEYEE ESLLTSFKLD TGAEDSSGSS PATSAIPFIS 600 

ENISQGYIFS SENPETITYD VLIPESARNA SEDSTSSGSE ESLKDPSMEG NVWFPSSTDI 660 

TAQPDVGSGR ESFLQTNYTE IRVDESEKTT KSFSAGPVMS QGPSVTDLEM PHYSTFAYFP 720 

TEVTPHAFTP SSRQQDLVST VNVVYSQTTQ PVYNEASNSS HESRIGLAEG LESEKKAVIP 780 

LVIVSALTFI CLVVLVGILI YWRKCFQTAH FYLEDSTSPR VISTPPTPIF PISDDVGAIP 840 

IKHFPKHVAD LHASSGFTEE FEEVQSCTVD LGITADSSNH PDNKHKNRYI NIVAYDHSRV 900 

KLAQLAEKDG KLTDYINANY VDGYNRPKAY IAAQGPLKST AEDFWRMIWE HNVEVIVMIT 960 

NLVEKGRRKC DQYWPADGSE EYGNFLVTQK SVQVLAYYTV RNFTLRNTKI KKGSQKGRPS 1020 

GRVVTQYHYT QWPDMGVPEY SLPVLTFVRK AAYAKRHAVG PVVVHCSAGV GRTGTYIVLD 1080 

SMLQQIQHEG TVNIFGFLKH IRSQRNYLVQ TEEQYVFIHD TLVEAILSKE TEVLDSHIHA 1140 

YVNALLIPGP AGKTKLEKQF QLLSQSNIQQ SDYSAALKQC NREKNRTSSI IPVERSRVGI 1200 

SSLSGEGTDY INASYIMGYY QSNEFIITQH PLLHTIKDFW RMIWDHNAQL VVMIPDGQNM 1260 

AEDEFVYWPN KDEPINCESF KVTLMAEEHK CLSNEEKLII QDFILEATQD DYVLEVRHFQ 1320 

CPKWPNPDSP ISKTFELISV IKEEAANRDG PMIVHDEHGG VTAGTFCALT TLMHQLEKEN 1380 

SVDVYQVAKM INLMRPGVFA DIEQYQFLYK VILSLVSTRQ EENPSTSLDS NGAALPDGNI 1440 
AESLESLV 



Seq ID NO: 185 DNA sequence 

Nucleic Acid Accession #: EOS sequence 

Coding sequence: 501-4514 

1 11 21 31 41 51 
CACACATACG CACGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA 120 

CGGCGAGGGG CCGCAGACCG TCTGGAAATG CGAATCCTAA AGCGTTTCCT CGCTTGCATT 180 

CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAT TGGGGAAAGA 300 

AATATCCAAC ATGTAATAGC CCAAAACAAT CTCCTATCAA TATTGATGAA GATCTTACAC 360 

AAGTAAATGT GAATCTTAAG AAACTTAAAT TTCAGGGTTG GGATAAAACA TCATTGGAAA 420 

ACACATTCAT TCATAACACT GGGAAAACAG TGGAAATTAA TCTCACTAAT GACTACCGTG 480 

TCAGCGGAGG AGTTTCAGAA ATGGTGTTTA AAGCAAGCAA GATAACTTTT CACTGGGGAA 540 

AATGCAATAT GTCATCTGAT GGATCAGAGC ATAGTTTAGA AGGACAAAAA TTTCCACTTG 600 

AGATGCAAAT CTACTGCTTT GATGCGGACC GATTTTCAAG TTTTGAGGAA GCAGTCAAAG 660 

GAAAAGGGAA GTTAAGAGCT TTATCCATTT TGTTTGAGGT TGGGACAGAA GAAAATTTGG 720 

ATTTCAAAGC GATTATTGAT GGAGTCGAAA GTGTTAGTCG TTTTGGGAAG CAGGCTGCTT 780 

TAGATCCATT CATACTGTTG AACCTTCTGC CAAACTCAAC TGACAAGTAT TACATTTACA 840 

ATGGCTCATT GACATCTCCT CCCTGCACAG ACACAGTTGA CTGGATTGTT TTTAAAGATA 900 

CAGTTAGCAT CTCTGAAAGC CAGTTGGCTG TTTTTTGTGA AGTTCTTACA ATGCAACAAT 960 

CTGGTTATGT CATGCTGATG GACTACTTAC AAAACAATTT TCGAGAGCAA CAGTACAAGT 1020 

TCTCTAGACA GGTGTTTTCC TCATACACTG GAAAGGAAGA GATTCATGAA GCAGTTTGTA 1080 

GTTCAGAACC AGAAAATGTT CAGGCTGACC CAGAGAATTA TACCAGCCTT CTTGTTACAT 1140 

GGGAAAGACC TCGAGTCGTT TATGATACCA TGATTGAGAA GTTTGCAGTT TTGTACCAGC 1200 

AGTTGGATGG AGAGGACCAA ACCAAGCATG AATTTTTGAC AGATGGCTAT CAAGACTTGG 1260 

GTGCTATTCT CAATAATTTG CTACCCAATA TGAGTTATGT TCTTCAGATA GTAGCCATAT 1320 

GCACTAATGG CTTATATGGA AAATACAGCG ACCAACTGAT TGTCGACATG CCTACTGATA 1380 

ATCCTGAACT TGATCTTTTC CCTGAATTAA TTGGAACTGA AGAAATAATC AAGGAGGAGG 1440 

AAGAGGGAAA AGACATTGAA GAAGGCGCTA TTGTGAATCC TGGTAGAGAC AGTGCTACAA 1500 

ACCAAATCAG GAAAAAGGAA CCCCAGATTT CTACCACAAC ACACTACAAT CGCATAGGGA 1560 

CGAAATACAA TGAAGCCAAG ACTAACCGAT CCCCAACAAG AGGAAGTGAA TTCTCTGGAA 1620 

AGGGTGATGT TCCCAATACA TCTTTAAATT CCACTTCCCA ACCAGTCACT AAATTAGCCA 1680 

CAGAAAAAGA TATTTCCTTG ACTTCTCAGA CTGTGACTGA ACTGCCACCT CACACTGTGG 1740 

AAGGTACTTC AGCCTCTTTA AATGATGGCT CTAAAACTGT TCTTAGATCT CCACATATGA 1800 

ACTTGTCGGG GACTGCAGAA TCCTTAAATA CAGTTTCTAT AACAGAATAT GAGGAGGAGA 1860 

GTTTATTGAC CAGTTTCAAG CTTGATACTG GAGCTGAAGA TTCTTCAGGC TCCAGTCCCG 1920 

CAACTTCTGC TATCCCATTC ATCTCTGAGA ACATATCCCA AGGGTATATA TTTTCCTCCG 1980 

AAAACCCAGA GACAATAACA TATGATGTCC TTATACCAGA ATCTGCTAGA AATGCTTCCG 2040 

AAGATTCAAC TTCATCAGGT TCAGAAGAAT CACTAAAGGA TCCTTCTATG GAGGGAAATG 2100 

TGTGGTTTCC TAGCTCTACA GACATAACAG CACAGCCCGA TGTTGGATCA GGCAGAGAGA 2160 

GCTTTCTCCA GACTAATTAC ACTGAGATAC GTGTTGATGA ATCTGAGAAG ACAACCAAGT 2220 

CCTTTTCTGC AGGCCCAGTG ATGTCACAGG GTCCCTCAGT TACAGATCTG GAAATGCCAC 2280 

ATTATTCTAC CTTTGCCTAC TTCCCAACTG AGGTAACACC TCATGCTTTT ACCCCATCCT 2340 

CCAGACAACA GGATTTGGTC TCCACGGTCA ACGTGGTATA CTCGCAGACA ACCCAACCGG 2400 

TATACAATGA GGCCAGTAAT AGTAGCCATG AGTCTCGTAT TGGTCTAGCT GAGGGGTTGG 2460

AATCCGAGAA GAAGGCAGTT ATACCCCTTG TGATCGTGTC AGCCCTGACT TTTATCTGTC 2520 

TAGTGGTTCT TGTGGGTATT CTCATCTACT GGAGGAAATG CTTCCAGACT GCACACTTTT 2580 

ACTTAGAGGA CAGTACATCC CCTAGAGTTA TATCCACACC TCCAACACCT ATCTTTCCAA 2640 

TTTCAGATGA TGTCGGAGCA ATTCCAATAA AGCACTTTCC AAAGCATGTT GCAGATTTAC 2700 

ATGCAAGTAG TGGGTTTACT GAAGAATTTG AGACACTGAA AGAGTTTTAC CAGGAAGTGC 2760 

AGAGCTGTAC TGTTGACTTA GGTATTACAG CAGACAGCTC CAACCACCCA GACAACAAGC 2820 

ACAAGAATCG ATACATAAAT ATCGTTGCCT ATGATCATAG CAGGGTTAAG CTAGCACAGC 2880 

TTGCTGAAAA GGATGGCAAA CTGACTGATT ATATCAATGC CAATTATGTT GATGGCTACA 2940 

ACAGACCAAA AGCTTATATT GCTGCCCAAG GCCCACTGAA ATCCACAGCT GAAGATTTCT 3000 

GGAGAATGAT ATGGGAACAT AATGTGGAAG TTATTGTCAT GATAACAAAC CTCGTGGAGA 3060 

AAGGAAGGAG AAAATGTGAT CAGTACTGGC CTGCCGATGG GAGTGAGGAG TACGGGAACT 3120 

TTCTGGTCAC TCAGAAGAGT GTGCAAGTGC TTGCCTATTA TACTGTGAGG AATTTTACTC 3180 

TAAGAAACAC AAAAATAAAA AAGGGCTCCC AGAAAGGAAG ACCCAGTGGA CGTGTGGTCA 3240 

CACAGTATCA CTACACGCAG TGGCCTGACA TGGGAGTACC AGAGTACTCC CTGCCAGTGC 3300 

TGACCTTTGT GAGAAAGGCA GCCTATGCCA AGCGCCATGC AGTGGGGCCT GTTGTCGTCC 3360 

ACTGCAGTGC TGGAGTTGGA AGAACAGGCA CATATATTGT GCTAGACAGT ATGTTGCAGC 3420 

AGATTCAACA CGAAGGAACT GTCAACATAT TTGGCTTCTT AAAACACATC CGTTCACAAA 3480 

GAAATTATTT GGTACAAACT GAGGAGCAAT ATGTCTTCAT TCATGATACA CTGGTTGAGG 3540 

CCATACTTAG TAAAGAAACT GAGGTGCTGG ACAGTCATAT TCATGCCTAT GTTAATGCAC 3600 

TCCTCATTCC TGGACCAGCA GGCAAAACAA AGCTAGAGAA ACAATTCCAG CTCCTGAGCC 3660 

AGTCAAATAT ACAGCAGAGT GACTATTCTG CAGCCCTAAA GCAATGCAAC AGGGAAAAGA 3720 

ATCGAACTTC TTCTATCATC CCTGTGGAAA GATCAAGGGT TGGCATTTCA TCCCTGAGTG 3780 

GAGAAGGCAC AGACTACATC AATGCCTCCT ATATCATGGG CTATTACCAG AGCAATGAAT 3840 

TCATCATTAC CCAGCACCCT CTCCTTCATA CCATCAAGGA TTTCTGGAGG ATGATATGGG 3900 

ACCATAATGC CCAACTGGTG GTTATGATTC CTGATGGCCA AAACATGGCA GAAGATGAAT 3960 

TTGTTTACTG GCCAAATAAA GATGAGCCTA TAAATTGTGA GAGCTTTAAG GTCACTCTTA 4020 

TGGCTGAAGA ACACAAATGT CTATCTAATG AGGAAAAACT TATAATTCAG GACTTTATCT 4080 

TAGAAGCTAC ACAGGATGAT TATGTACTTG AAGTGAGGCA CTTTCAGTGT CCTAAATGGC 4140 

CAAATCCAGA TAGCCCCATT AGTAAAACTT TTGAACTTAT AAGTGTTATA AAAGAAGAAG 4200 

CTGCCAATAG GGATGGGCCT ATGATTGTTC ATGATGAGCA TGGAGGAGTG ACGGCAGGAA 4260 

CTTTCTGTGC TCTGACAACC CTTATGCACC AACTAGAAAA AGAAAATTCC GTGGATGTTT 4320 

ACCAGGTAGC CAAGATGATC AATCTGATGA GGCCAGGAGT CTTTGCTGAC ATTGAGCAGT 4380 

ATCAGTTTCT CTACAAAGTG ATCCTCAGCC TTGTGAGCAC AAGGCAGGAA GAGAATCCAT 4440 

CCACCTCTCT GGACAGTAAT GGTGCAGCAT TGCCTGATGG AAATATAGCT GAGAGCTTAG 4500 

AGTCTTTAGT TTAACACAGA AAGGGGTGGG GGGACTCACA TCTGAGCATT GTTTTCCTCT 4560 

TCCTAAAATT AGGCAGGAAA ATCAGTCTAG TTCTGTTATC TGTTGATTTC CCATCACCTG 4620 

ACAGTAACTT TCATGACATA GGATTCTGCC GCCAAATTTA TATCATTAAC AATGTGTGCC 4680 

TTTTTGCAAG ACTTGTAATT TACTTATTAT GTTTGAACTA AAATGATTGA ATTTTACAGT 4740 

ATTTCTAAGA ATGGAATTGT GGTATTTTTT TCTGTATTGA TTTTAACAGA AAATTTCAAT 4800 
TTATAGAGGT TAGGAATTCC AAACTACAGA AAATGTTTGT TTTTAGTGTC AAATTTTTAG 4860 
CTGTATTTGT AGCAATTATC AGGTTTGCTA GAAATATAAC TTTTAATACA GTAGCCTGTA 4920 
AATAAAACAC TCTTCCATAT GATATTCAAC ATTTTACAAC TGCAGTATTC ACCTAAAGTA 4980 
GAAATAATCT GTTACTTATT GTAAATACTG CCCTAGTGTC TCCATGGACC AAATTTATAT 5040 
TTATAATTGT AGATTTTTAT ATTTTACTAC TGAGTCAAGT TTTCTAGTTC TGTGTAATTG 5100
TTTAGTTTAA TGACGTAGTT CATTAGCTGG TCTTACTCTA CCAGTTTTCT GACATTGTAT 5160
TGTGTTACCT AAGTCATTAA CTTTGTTTCA GCATGTAATT TTAACTTTTG TGGAAAATAG 5220

AAATACCTTC ATTTTGAAAG AAGTTTTTAT GAGAATAACA CCTTACCAAA CATTGTTCAA 5280 

ATGGTTTTTA TCCAAGGAAT TGCAAAAATA AATATAAATA TTGCCATTAA AAAAAAAAAA 5340 
AAAAAAAAAA AAAAAAAAAA AAA 

Seq ID NO: 186 Protein sequence: 
Protein Accession #: EOS sequence

1 11 21 31 41 51 

| | | | | |

MVFKASKITF HWGKCNMSSD GSEHSLEGQK FPLEMQIYCF DADRFSSFEE AVKGKGKLRA 60

LSILFEVGTE ENLDFKAIID GVESVSRFGK QAALDPFILL NLLPNSTDKY YIYNGSLTSP 120

PCTDTVDWIV FKDTVSISES QLAVFCEVLT MQQSGYVMLM DYLQNNFREQ QYKFSRQVFS 180 

SYTGKEEIHE AVCSSEPENV QADPENYTSL LVTWERPRVV YDTMIEKFAV LYQQLDGEDQ 240

TKHEFLTDGY QDLGAILNNL LPNMSYVLQI VAICTNGLYG KYSDQLIVDM PTDNPELDLF 300 

PELIGTEEII KEEEEGKDIE EGAIVNPGRD SATNQIRKKE PQISTTTHYN RIGTKYNEAK 360 

TNRSPTRGSE FSGKGDVPNT SLNSTSQPVT KLATEKDISL TSQTVTELPP HTVEGTSASL 420

NDGSKTVLRS PHMNLSGTAE SLNTVSITEY EEESLLTSFK LDTGAEDSSG SSPATSAIPF 480

ISENISQGYI FSSENPETIT YDVLIPESAR NASEDSTSSG SEESLKDPSM EGNVWFPSST 540 

DITAQPDVGS GRESFLQTNY TEIRVDESEK TTKSFSAGPV MSQGPSVTDL EMPHYSTFAY 600 

FPTEVTPHAF TPSSRQQDLV STVNVVYSQT TQPVYNEASN SSHESRIGLA EGLESEKKAV 660

IPLVIVSALT FICLVVLVGI LIYWRKCFQT AHFYLEDSTS PRVISTPPTP IFPISDDVGA 720

IPIKHFPKHV ADLHASSGFT EEFETLKEFY QEVQSCTVDL GITADSSNHP DNKHKNRYIN 780 

IVAYDHSRVK LAQLAEKDGK LTDYINANYV DGYNRPKAYI AAQGPLKSTA EDFWRMIWEH 840 

NVEVIVMITN LVEKGRRKCD QYWPADGSEE YGNFLVTQKS VQVLAYYTVR NFTLRNTKIK 900 

KGSQKGRPSG RVVTQYHYTQ WPDMGVPEYS LPVLTFVRKA AYAKRHAVGP VVVHCSAGVG 960

RTGTYIVLDS MLQQIQHEGT VNIFGFLKHI RSQRNYLVQT EEQYVFIHDT LVEAILSKET 1020

EVLDSHIHAY VNALLIPGPA GKTKLEKQFQ LLSQSNIQQS DYSAALKQCN REKNRTSSII 1080 

PVERSRVGIS SLSGEGTDYI NASYIMGYYQ SNEFIITQHP LLHTIKDFWR MIWDHNAQLV 1140

VMIPDGQNMA EDEFVYWPNK DEPINCESFK VTLMAEEHKC LSNEEKLIIQ DFILEATQDD 1200 

YVLEVRHFQC PKWPNPDSPI SKTFELISVI KEEAANRDGP MIVHDEHGGV TAGTFCALTT 1260 

LMHQLEKENS VDVYQVAKMI NLMRPGVFAD IEQYQFLYKV ILSLVSTRQE ENPSTSLDSN 1320 
GAALPDGNIA ESLESLV 

Seq ID NO: 187 DNA sequence 

Nucleic Acid Accession #: EOS sequence

Coding sequence: 148-4632 

1 11 21 31 41 51 

| | | | | |

CACACATACG CACGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA 120 

CGGCGAGGGG CCGCAGACCG TCTGGAAATG CGAATCCTAA AACGTTTCCT CGCTTGCATT 180 

CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGGAAAG 300 

AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 

CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACCGT 480 

GTCAGCGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540 

AAATGCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 

GAGATGCAAA TCTACTGCTT TGATGCGGAC CGATTTTCAA GTTTTGAGGA AGCAGTCAAA 660 

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTG 720 

GATTTCAAAG CGATTATTGA TGGAGTCGAA AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 780 

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAGAT 900 

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960 

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080 

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140 

TGGGAAAGAC CTCGAGTCGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGTACCAG 1200 

CAGTTGGATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTCGACAT GCCTACTGAT 1380 

AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAAGAGGGAA AAGACATTGA AGAAGGCGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500 

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TCGCATAGGG 1560 

ACGAAATACA ATGAAGCCAA GACTAACCGA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 

GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800

AACTTGTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

GAAAACCCAG AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100 

GTGTGGTTTC CTAGCTCTAC AGACATAACA GCACAGCCCG ATGTTGGATC AGGCAGAGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGAGATA CGTGTTGATG AATCTGAGAA GACAACCAAG 2220 

TCCTTTTCTG CAGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 

TCCAGACAAC AGGATTTGGT CTCCACGGTC AACGTGGTAT ACTCGCAGAC AACCCAACCG 2400 

GTATACAATG AGGCCAGTAA TAGTAGCCAT GAGTCTCGTA TTGGTCTAGC TGAGGGGTTG 2460 

GAATCCGAGA AGAAGGCAGT TATACCCCTT GTGATCGTGT CAGCCCTGAC TTTTATCTGT 2520 

CTAGTGGTTC TTGTGGGTAT TCTCATCTAC TGGAGGAAAT GCTTCCAGAC TGCACACTTT 2580 

TACTTAGAGG ACAGTACATC CCCTAGAGTT ATATCCACAC CTCCAACACC TATCTTTCCA 2640 

ATTTCAGATG ATGTCGGAGC AATTCCAATA AAGCACTTTC CAAAGCATGT TGCAGATTTA 2700 

CATGCAAGTA GTGGGTTTAC TGAAGAATTT GAGACACTGA AAGAGTTTTA CCAGGAAGTG 2760 

CAGAGCTGTA CTGTTGACTT AGGTATTACA GCAGACAGCT CCAACCACCC AGACAACAAG 2820
CACAAGAATC GATACATAAA TATCGTTGCC TATGATCATA GCAGGGTTAA GCTAGCACAG 2880
CTTGCTGAAA AGGATGGCAA ACTGACTGAT TATATCAATG CCAATTATGT TGATGGCTAC 2940
AACAGACCAA AAGCTTATAT TGCTGCCCAA GGCCCACTGA AATCCACAGC TGAAGATTTC 3000
TGGAGAATGA TATGGGAACA TAATGTGGAA GTTATTGTCA TGATAACAAA CCTCGTGGAG 3060
AAAGGAAGGA GAAAATGTGA TCAGTACTGG CCTGCCGATG GGAGTGAGGA GTACGGGAAC 3120
TTTCTGGTCA CTCAGAAGAG TGTGCAAGTG CTTGCCTATT ATACTGTGAG GAATTTTACT 3180
CTAAGAAACA CAAAAATAAA AAAGGGCTCC CAGAAAGGAA GACCCAGTGG ACGTGTGGTC 3240
ACACAGTATC ACTACACGCA GTGGCCTGAC ATGGGAGTAC CAGAGTACTC CCTGCCAGTG 3300
CTGACCTTTG TGAGAAAGGC AGCCTATGCC AAGCGCCATG CAGTGGGGCC TGTTGTCGTC 3360
CACTGCAGTG CTGGAGTTGG AAGAACAGGC ACATATATTG TGCTAGACAG TATGTTGCAG 3420
CAGATTCAAC ACGAAGGAAC TGTCAACATA TTTGGCTTCT TAAAACACAT CCGTTCACAA 3480
AGAAATTATT TGGTACAAAC TGAGGAGCAA TATGTCTTCA TTCATGATAC ACTGGTTGAG 3540
GCCATACTTA GTAAAGAAAC TGAGGTGCTG GACAGTCATA TTCATGCCTA TGTTAATGCA 3600
CTCCTCATTC CTGGACCAGC AGGCAAAACA AAGCTAGAGA AACAATTCCA GGGTCTCACT 3660
CTGTCACCCA GGCTGGAGTG CAGAGGCACA ATCTCGGCTC ACTGCAACCT TCCTCTCCCT 3720
GGCTTAACTG ATCCTCCTAC CTCAGCCTCC CGAGTGGCTG GGACTATACT CCTGAGCCAG 3780
TCAAATATAC AGCAGAGTGA CTATTCTGCA GCCCTAAAGC AATGCAACAG GGAAAAGAAT 3840
CGAACTTCTT CTATCATCCC TGTGGAAAGA TCAAGGGTTG GCATTTCATC CCTGAGTGGA 3900
GAAGGCACAG ACTACATCAA TGCCTCCTAT ATCATGGGCT ATTACCAGAG CAATGAATTC 3960
ATCATTACCC AGCACCCTCT CCTTCATACC ATCAAGGATT TCTGGAGGAT GATATGGGAC 4020
CATAATGCCC AACTGGTGGT TATGATTCCT GATGGCCAAA ACATGGCAGA AGATGAATTT 4080
GTTTACTGGC CAAATAAAGA TGAGCCTATA AATTGTGAGA GCTTTAAGGT CACTCTTATG 4140
GCTGAAGAAC ACAAATGTCT ATCTAATGAG GAAAAACTTA TAATTCAGGA CTTTATCTTA 4200
GAAGCTACAC AGGATGATTA TGTACTTGAA GTGAGGCACT TTCAGTGTCC TAAATGGCCA 4260
AATCCAGATA GCCCCATTAG TAAAACTTTT GAACTTATAA GTGTTATAAA AGAAGAAGCT 4320
GCCAATAGGG ATGGGCCTAT GATTGTTCAT GATGAGCATG GAGGAGTGAC GGCAGGAACT 4380
TTCTGTGCTC TGACAACCCT TATGCACCAA CTAGAAAAAG AAAATTCCGT GGATGTTTAC 4440
CAGGTAGCCA AGATGATCAA TCTGATGAGG CCAGGAGTCT TTGCTGACAT TGAGCAGTAT 4500
CAGTTTCTCT ACAAAGTGAT CCTCAGCCTT GTGGGCACAA GGCAGGAAGA GAATCCATCC 4560
ACCTCTCTGG ACAGTAATGG TGCAGCATTG CCTGATGGAA ATATAGCTGA GAGCTTAGAG 4620
TCTTTAGTTT AACACAGAAA GGGGTGGGGG GACTCACATC TGAGCATTGT TTTCCTCTTC 4680
CTAAAATTAG GCAGGAAAAT CAGTCTAGTT CTGTTATCTG TTGATTTCCC ATCACCTGAC 4740
AGTAACTTTC ATGACATAGG ATTCTGCCGC CAAATTTATA TCATTAACAA TGTGTGCCTT 4800
TTTGCAAGAC TTGTAATTTA CTTATTATGT TTGAACTAAA ATGATTGAAT TTTACAGTAT 4860
TTCTAAGAAT GGAATTGTGG TATTTTTTTC TGTATTGATT TTAACAGAAA ATTTCAATTT 4920
ATAGAGGTTA GGAATTCCAA ACTACAGAAA ATGTTTGTTT TTAGTGTCAA ATTTTTAGCT 4980
GTATTTGTAG CAATTATCAG GTTTGCTAGA AATATAACTT TTAATACAGT AGCCTGTAAA 5040
TAAAACACTC TTCCATATGA TATTCAACAT TTTACAACTG CAGTATTCAC CTAAAGTAGA 5100
AATAATCTGT TACTTATTGT AAATACTGCC CTAGTGTCTC CATGGACCAA ATTTATATTT 5160
ATAATTGTAG ATTTTTATAT TTTACTACTG AGTCAAGTTT TCTAGTTCTG TGTAATTGTT 5220
TAGTTTAATG ACGTAGTTCA TTAGCTGGTC TTACTCTACC AGTTTTCTGA CATTGTATTG 5280
TGTTACCTAA GTCATTAACT TTGTTTCAGC ATGTAATTTT AACTTTTGTG GAAAATAGAA 5340
ATACCTTCAT TTTGAAAGAA GTTTTTATGA GAATAACACC TTACCAAACA TTGTTCAAAT 5400
GGTTTTTATC CAAGGAATTG CAAAAATAAA TATAAATATT GCCATTAAAA AAAAAAAAAA 5460
AAAAAAAAAA AAAAAAAAAA A

Seq ID NO: 188 Protein sequence:
Protein Accession #: EOS sequence



1 11 21 31 41 51
| | | | | |

MRILKRFLAC IQLLCVCRLD WANGYYRQQR KLVEEIGWSY TGALNQKNWG KKYPTCNSPK
QSPINIDEDL TQVNVNLKKL KFQGWDKTSL ENTFIHNTGK TVEINLTNDY RVSGGVSEMV
FKASKITFHW GKCNMSSDGS EHSLEGQKFP LEMQIYCFDA DRFSSFEEAV KGKGKLRALS
ILFEVGTEEN LDFKAIIDGV ESVSRFGKQA ALDPFILLNL LPNSTDKYYI YNGSLTSPPC
TDTVDWIVFK DTVSISESQL AVFCEVLTMQ QSGYVMLMDY LQNNFREQQY KFSRQVFSSY
TGKEEIHEAV CSSEPENVQA DPENYTSLLV TWERPRVVYD TMIEKFAVLY QQLDGEDQTK
HEFLTDGYQD LGAILNNLLP NMSYVLQIVA ICTNGLYGKY SDQLIVDMPT DNPELDLFPE
LIGTEEIIKE EEEGKDIEEG AIVNPGRDSA TNQIRKKEPQ ISTTTHYNRI GTKYNEAKTN
RSPTRGSEFS GKGDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTELPPHT VEGTSASLND
GSKTVLRSPH MNLSGTAESL NTVSITEYEE ESLLTSFKLD TGAEDSSGSS PATSAIPFIS
ENISQGYIFS SENPETITYD VLIPESARNA SEDSTSSGSE ESLKDPSMEG NVWFPSSTDI
TAQPDVGSGR ESFLQTNYTE IRVDESEKTT KSFSAGPVMS QGPSVTDLEM PHYSTFAYFP
TEVTPHAFTP SSRQQDLVST VNVVYSQTTQ PVYNEASNSS HESRIGLAEG LESEKKAVIP
LVIVSALTFI CLVVLVGILI YWRKCFQTAH FYLEDSTSPR VISTPPTPIF PISDDVGAIP
IKHFPKHVAD LHASSGFTEE FETLKEFYQE VQSCTVDLGI TADSSNHPDN KHKNRYINIV
AYDHSRVKLA QLAEKDGKLT DYINANYVDG YNRPKAYIAA QGPLKSTAED FWRMIWEHNV
EVIVMITNLV EKGRRKCDQY WPADGSEEYG NFLVTQKSVQ VLAYYTVRNF TLRNTKIKKG
SQKGRPSGRV VTQYHYTQWP DMGVPEYSLP VLTFVRKAAY AKRHAVGPVV VHCSAGVGRT
GTYIVLDSML QQIQHEGTVN IFGFLKHIRS QRNYLVQTEE QYVFIHDTLV EAILSKETEV
LDSHIHAYVN ALLIPGPAGK TKLEKQFQGL TLSPRLECRG TISAHCNLPL PGLTDPPTSA
SRVAGTILLS QSNIQQSDYS AALKQCNREK NRTSSIIPVE RSRVGISSLS GEGTDYINAS
YIMGYYQSNE FIITQHPLLH TIKDFWRMIW DHNAQLVVMI PDGQNMAEDE FVYWPNKDEP
INCESFKVTL MAEEHKCLSN EEKLIIQDFI LEATQDDYVL EVRHFQCPKW PNPDSPISKT
FELISVIKEE AANRDGPMIV HDEHGGVTAG TFCALTTLMH QLEKENSVDV YQVAKMINLM
RPGVFADIEQ YQFLYKVILS LVGTRQEENP STSLDSNGAA LPDGNIAESL ESLV



Seq ID NO: 189 DNA sequence
Nucleic Acid Accession #: NM_002820
Coding sequence: 304..831

1 11 21 31 41 51
| | | | | |

CCGGTTCGCA AAGAAGCTGA CTTCAGAGGG GGAAACTTTC TTCTTTTAGG AGGCGGTTAG 60

CCCTGTTCCA CGAACCCAGG AGAACTGCTG GCCAGATTAA TTAGACATTG CTATGGGAGA 120

CGTGTAAACA CACTACTTAT CATTGATGCA TATATAAAAC CATTTTATTT TCGCTATTAT 180

TTCAGAGGAA GCGCCTCTGA TTTGTTTCTT TTTTCCCTTT TTGCTCTTTC TGGCTGTGTG 240 

GTTTGGAGAA AGCACAGTTG GAGTAGCCGG TTGCTAAATA AGTCCCGAGC GCGAGCGGAG 300

ACGATGCAGC GGAGACTGGT TCAGCAGTGG AGCGTCGCGG TGTTCCTGCT GAGCTACGCG 360 

GTGCCCTCCT GCGGGCGCTC GGTGGAGGGT CTCAGCCGCC GCCTCAAAAG AGCTGTGTCT 420 

GAACATCAGC TCCTCCATGA CAAGGGGAAG TCCATCCAAG ATTTACGGCG ACGATTCTTC 480 

CTTCACCATC TGATCGCAGA AATCCACACA GCTGAAATCA GAGCTACCTC GGAGGTGTCC 540 

CCTAACTCCA AGCCCTCTCC CAACACAAAG AACCACCCCG TCCGATTTGG GTCTGATGAT 600 

GAGGGCAGAT ACCTAACTCA GGAAACTAAC AAGGTGGAGA CGTACAAAGA GCAGCCGCTC 660 

AAGACACCTG GGAAGAAAAA GAAAGGCAAG CCCGGGAAAC GCAAGGAGCA GGAAAAGAAA 720 

AAACGGCGAA CTCGCTCTGC CTGGTTAGAC TCTGGAGTGA CTGGGAGTGG GCTAGAAGGG 780 

GACCACCTGT CTGACACCTC CACAACGTCG CTGGAGCTCG ATTCACGGTA ACAGGCTTCT 840 

CTGGCCCGTA GCCTCAGCGG GGTGCTCTCA GCTGGGTTTT GGAGCCTCCC TTCTGCCTTG 900 

GCTTGGACAA ACCTAGAATT TTCTCCCTTT ATGTATCTCT ATCGATTGTG TAGCAATTGA 960 

CAGAGAATAA CTCAGAATAT TGTCTGCCTT AAAGCAGTAC CCCCCTACCA CACACACCCC 1020 

TGTCCTCCAG CACCATAGAG AGGCGCTAGA GCCCATTCCT CTTTCTCCAC CGTCACCCAA 1080 

CATCAATCCT TTACCACTCT ACCAAATAAT TTCATATTCA AGCTTCAGAA GCTAGTGACC 1140 

ATCTTCATAA TTTGCTGGAG AAGTGTATTT CTTCCCCTTA CTCTCACACC TGGGCAAACT 1200 

TTCTTCAGTG TTTTTCATTT CTTACGTTCT TTCACTTCAA GGGAGAATAT AGAAGCATTT 1260 

GATATTATCT ACAAACACTG CAGAACAGCA TCATGTCATA AACGATTCTG AGCCATTCAC 1320

ACTTTTTATT TAATTAAATG TATTTAATTA AATCTCAAAT TTATTTTAAT GTAAAGAACT 1380 

TAAATTATGT TTTAAACACA TGCCTTAAAT TTGTTTAATT AAATTTAACT CTGGTTTCTA 1440 

CCAGCTCATA CAAAATAAAT GGTTTCTGAA AATGTTTAAG TATTAACTTA CAAGGATATA 1500 

GGTTTTTCTC ATGTATCTTT TTGTTCATTG GCAAGATGAA ATAATTTTTC TAGGGTAATG 1560 
CCGTAGGAAA AATAAAACTT CACATTTAAA AAAAA 



Seq ID NO: 190 Protein sequence:
Protein Accession #: NP_002811

1 11 21 31 41 51
| | | | | |

MQRRLVQQWS VAVFLLSYAV PSCGRSVEGL SRRLKRAVSE HQLLHDKGKS IQDLRRRFFL 60
HHLIAEIHTA EIRATSEVSP NSKPSPNTKN HPVRFGSDDE GRYLTQETNK VETYKEQPLK 120
TPGKKKKGKP GKRKEQEKKK RRTRSAWLDS GVTGSGLEGD HLSDTSTTSL ELDSR



Seq ID NO: 191 DNA sequence
Nucleic Acid Accession #: XM_059328
Coding sequence: 52..1023

1 11 21 31 41 51
| | | | | |

GGGCTGTCCG GCCCACTCCC CTGGGAGCGC GAGCGGTGGA CCCAGGCGGC CATGTCCCGC 60
CCTCGCATGC GCCTGGTGGT CACCGCGGAC GACTTTGGTT ACTGCCCGCG ACGCGATGAG 120
GGTATCGTGG AGGCCTTTCT GGCCGGGGCT GTGACCAGCG TGTCCCTGCT GGTCAACGGT 180
GCGGCCACGG AGAGCGCGGC GGAGCTGGCC CGCAGGCACA GCATCCCCAC GGGCCTCCAC 240
GCCAACCTGT CCGAGGGCCG CCCCGTGGGT CCGGCCCGCC GTGGCGCCTC ATCGCTGCTC 300
GGCCCGGAAG GCTTCTTCCT TGGCAAGATG GGATTCCGGG AGGCGGTGGC GGCCGGAGAC 360
GTGGATTTGC CTCAGGTGCG GGAGGAGCTC GAGGCCCAAC TAAGCTGCTT CCGGGAGCTG 420
CTGGGCAGGG CCCCCACGCA CGCGGACGGG CACCAGCACG TGCACGTGCT CCCAGGCGTG 480
TGCCAGGTGT TCGCCGAGGC GCTGCAGGCC TATGGGGTGC GCTTTACGCG ACTGCCGCTG 540
GAGCGCGGTG TGGGTGGCTG CACTTGGCTG GAGGCCCCCG CGCGTGCCTT CGCCTGCGCC 600
GTGGAGCGCG ACGCCCGGGC CGCCGTGGGC CCCTTCTCCC GCCACGGCCT GCGGTGGACA 660
GACGCCTTCG TGGGCCTGAG CACTTGCGGC CGGCACATGT CCGCTCACCG CGTGTCCGGG 720
GCCCTGGCGC GGGTCCTGGA AGGTACCCTA GCGGGCCACA CCCTGACAGC CGAGCTGATG 780
GCGCACCCCG GCTACCCCAG TGTGCCTCCC ACCGGCGGCT GCGGTGAAGG CCCCGACGCT 840
TTCTCTTGCT CTTGGGAGCG GCTGCATGAG CTGCGCGTCC TCACCGCGCC CACGCTGCGG 900
GCCCAGCTTG CCCAGGATGG CGTGCAGCTT TGCGCCCTCG ACGACCTGGA CTCCAAGAGG 960
CCAGGGGAGG AGGTCCCCTG TGAGCCCACT CTGGAACCCT TCCTGGAACC CTCCCTACTC 1020
TGACCCCCTA CAGACAACCA AGCACTAATC CCCTTAGTAC CAAGAAAGGG GAGCCAGGAT 1080
TTAGTCCTGG CCCAGCCCAG AGCTGGGACC TGGAGCACGA TCTGTTGACT TCCCTGGGTA 1140
GGACACTGCC ACCTCTGGGC TCAGGTCCTC ATGCCTCCAA ATGGCATCTA GAGTTTGAGC 1200
AGCCTTCTTG GCTGCAGGCA GGCCTAGCCT GTGGCAGCGG GCTAGGGCCC GCAGAGCATT 1260
TGGTGCCCCT CCATGTTGCA ATGCAAACAC CTTCACCACT GGGGCAGTGG GGAGAGATGG 1320
CTATATTAAT AAAATAACGT GTGTCTTTC



Seq ID NO: 192 Protein sequence:
Protein Accession #: XP_059328

1 11 21 31 41 51
| | | | | |

MSRPRMRLVV TADDFGYCPR RDEGIVEAFL AGAVTSVSLL VNGAATESAA ELARRHSIPT 60
GLHANLSEGR PVGPARRGAS SLLGPEGFFL GKMGFREAVA AGDVDLPQVR EELEAQLSCF 120
RELLGRAPTH ADGHQHVHVL PGVCQVFAEA LQAYGVRFTR LPLERGVGGC TWLEAPARAF 180
ACAVERDARA AVGPFSRHGL RWTDAFVGLS TCGRHMSAHR VSGALARVLE GTLAGHTLTA 240
ELMAHPGYPS VPPTGGCGEG PDAFSCSWER LHELRVLTAP TLRAQLAQDG VQLCALDDLD 300
SKRPGEEVPC EPTLEPFLEP SLL



Seq ID NO: 193 DNA sequence

Nucleic Acid Accession #: NM_005688.1

Coding sequence: 126..4439

1 11 21 31 41 51
| | | | | |

CCGGGCAGGT GGCTCATGCT CGGGAGCGTG GTTGAGCGGC TGGCGCGGTT GTCCTGGAGC 60

AGGGGCGCAG GAATTCTGAT GTGAAACTAA CAGTCTGTGA GCCCTGGAAC CTCCGCTCAG 120

AGAAGATGAA GGATATCGAC ATAGGAAAAG AGTATATCAT CCCCAGTCCT GGGTATAGAA 180 

GTGTGAGGGA GAGAACCAGC ACTTCTGGGA CGCACAGAGA CCGTGAAGAT TCCAAGTTCA 240

GGAGAACTCG ACCGTTGGAA TGCCAAGATG CCTTGGAAAC AGCAGCCCGA GCCGAGGGCC 300 

TCTCTCTTGA TGCCTCCATG CATTCTCAGC TCAGAATCCT GGATGAGGAG CATCCCAAGG 360 

GAAAGTACCA TCATGGCTTG AGTGCTCTGA AGCCCATCCG GACTACTTCC AAACACCAGC 420

ACCCAGTGGA CAATGCTGGG CTTTTTTCCT GTATGACTTT TTCGTGGCTT TCTTCTCTGG 480 

CCCGTGTGGC CCACAAGAAG GGGGAGCTCT CAATGGAAGA CGTGTGGTCT CTGTCCAAGC 540 

ACGAGTCTTC TGACGTGAAC TGCAGAAGAC TAGAGAGACT GTGGCAAGAA GAGCTGAATG 600 

AAGTTGGGCC AGACGCTGCT TCCCTGCGAA GGGTTGTGTG GATCTTCTGC CGCACCAGGC 660 

TCATCCTGTC CATCGTGTGC CTGATGATCA CGCAGCTGGC TGGCTTCAGT GGACCAGCCT 720 

TCATGGTGAA ACACCTCTTG GAGTATACCC AGGCAACAGA GTCTAACCTG CAGTACAGCT 780 

TGTTGTTAGT GCTGGGCCTC CTCCTGACGG AAATCGTGCG GTCTTGGTCG CTTGCACTGA 840 

CTTGGGCATT GAATTACCGA ACCGGTGTCC GCTTGCGGGG GGCCATCCTA ACCATGGCAT 900 

TTAAGAAGAT CCTTAAGTTA AAGAACATTA AAGAGAAATC CCTGGGTGAG CTCATCAACA 960 

TTTGCTCCAA CGATGGGCAG AGAATGTTTG AGGCAGCAGC CGTTGGCAGC CTGCTGGCTG 1020 

GAGGACCCGT TGTTGCCATC TTAGGCATGA TTTATAATGT AATTATTCTG GGACCAACAG 1080 

GCTTCCTGGG ATCAGCTGTT TTTATCCTCT TTTACCCAGC AATGATGTTT GCATCACGGC 1140 

TCACAGCATA TTTCAGGAGA AAATGCGTGG CCGCCACGGA TGAACGTGTC CAGAAGATGA 1200 

ATGAAGTTCT TACTTACATT AAATTTATCA AAATGTATGC CTGGGTCAAA GCATTTTCTC 1260 

AGAGTGTTCA AAAAATCCGC GAGGAGGAGC GTCGGATATT GGAAAAAGCC GGGTACTTCC 1320 

AGGGTATCAC TGTGGGTGTG GCTCCCATTG TGGTGGTGAT TGCCAGCGTG GTGACCTTCT 1380 

CTGTTCATAT GACCCTGGGC TTCGATCTGA CAGCAGCACA GGCTTTCACA GTGGTGACAG 1440 

TCTTCAATTC CATGACTTTT GCTTTGAAAG TAACACCGTT TTCAGTAAAG TCCCTCTCAG 1500 

AAGCCTCAGT GGCTGTTGAC AGATTTAAGA GTTTGTTTCT AATGGAAGAG GTTCACATGA 1560 

TAAAGAACAA ACCAGCCAGT CCTCACATCA AGATAGAGAT GAAAAATGCC ACCTTGGCAT 1620 

GGGACTCCTC CCACTCCAGT ATCCAGAACT CGCCCAAGCT GACCCCCAAA ATGAAAAAAG 1680 

ACAAGAGGGC TTCCAGGGGC AAGAAAGAGA AGGTGAGGCA GCTGCAGCGC ACTGAGCATC 1740 

AGGCGGTGCT GGCAGAGCAG AAAGGCCACC TCCTCCTGGA CAGTGACGAG CGGCCCAGTC 1800 

CCGAAGAGGA AGAAGGCAAG CACATCCACC TGGGCCACCT GCGCTTACAG AGGACACTGC 1860 

ACAGCATCGA TCTGGAGATC CAAGAGGGTA AACTGGTTGG AATCTGCGGC AGTGTGGGAA 1920 

GTGGAAAAAC CTCTCTCATT TCAGCCATTT TAGGCCAGAT GACGCTTCTA GAGGGCAGCA 1980 

TTGCAATCAG TGGAACCTTC GCTTATGTGG CCCAGCAGGC CTGGATCCTC AATGCTACTC 2040 

TGAGAGACAA CATCCTGTTT GGGAAGGAAT ATGATGAAGA AAGATACAAC TCTGTGCTGA 2100 

ACAGCTGCTG CCTGAGGCCT GACCTGGCCA TTCTTCCCAG CAGCGACCTG ACGGAGATTG 2160 

GAGAGCGAGG AGCCAACCTG AGCGGTGGGC AGCGCCAGAG GATCAGCCTT GCCCGGGCCT 2220 

TGTATAGTGA CAGGAGCATC TACATCCTGG ACGACCCCCT CAGTGCCTTA GATGCCCATG 2280 

TGGGCAACCA CATCTTCAAT AGTGCTATCC GGAAACATCT CAAGTCCAAG ACAGTTCTGT 2340 

TTGTTACCCA CCAGTTACAG TACCTGGTTG ACTGTGATGA AGTGATCTTC ATGAAAGAGG 2400 

GCTGTATTAC GGAAAGAGGC ACCCATGAGG AACTGATGAA TTTAAATGGT GACTATGCTA 2460 

CCATTTTTAA TAACCTGTTG CTGGGAGAGA CACCGCCAGT TGAGATCAAT TCAAAAAAGG 2520 

AAACCAGTGG TTCACAGAAG AAGTCACAAG ACAAGGGTCC TAAAACAGGA TCAGTAAAGA 2580 

AGGAAAAAGC AGTAAAGCCA GAGGAAGGGC AGCTTGTGCA GCTGGAAGAG AAAGGGCAGG 2640 

GTTCAGTGCC CTGGTCAGTA TATGGTGTCT ACATCCAGGC TGCTGGGGGC CCCTTGGCAT 2700 

TCCTGGTTAT TATGGCCCTT TTCATGCTGA ATGTAGGCAG CACCGCCTTC AGCACCTGGT 2760 

GGTTGAGTTA CTGGATCAAG CAAGGAAGCG GGAACACCAC TGTGACTCGA GGGAACGAGA 2820 

CCTGGGTGAG TGACAGCATG AAGGACAATC CTCATATGCA GTACTATGCC AGCATCTACG 2880 

CCCTCTCCAT GGCAGTCATG CTGATCCTGA AAGCCATTCG AGGAGTTGTC TTTGTCAAGG 2940 

GCACGCTGCG AGCTTCCTCC CGGCTGCATG ACGAGCTTTT CCGAAGGATC CTTCGAAGCC 3000 

CTATGAAGTT TTTTGACACG ACCCCCACAG GGAGGATTCT CAACAGGTTT TCCAAAGACA 3060 

TGGATGAAGT TGACGTGCGG CTGCCGTTCC AGGCCGAGAT GTTCATCCAG AACGTTATCC 3120 

TGGTGTTCTT CTGTGTGGGA ATGATCGCAG GAGTCTTCCC GTGGTTCCTT GTGGCAGTGG 3180 

GGCCCCTTGT CATCCTCTTT TCAGTCCTGC ACATTGTCTC CAGGGTCCTG ATTCGGGAGC 3240 

TGAAGCGTCT GGACAATATC ACGCAGTCAC CTTTCCTCTC CCACATCACG TCCAGCATAC 3300 

AGGGCCTTGC CACCATCCAC GCCTACAATA AAGGGCAGGA GTTTCTGCAC AGATACCAGG 3360 

AGCTGCTGGA TGACAACCAA GCTCCTTTTT TTTTGTTTAC GTGTGCGATG CGGTGGCTGG 3420 

CTGTGCGGCT GGACCTCATC AGCATCGCCC TCATCACCAC CACGGGGCTG ATGATCGTTC 3480 

TTATGCACGG GCAGATTCCC CCAGCCTATG CGGGTCTCGC CATCTCTTAT GCTGTCCAGT 3540 

TAACGGGGCT GTTCCAGTTT ACGGTCAGAC TGGCATCTGA GACAGAAGCT CGATTCACCT 3600 

CGGTGGAGAG GATCAATCAC TACATTAAGA CTCTGTCCTT GGAAGCACCT GCCAGAATTA 3660 

AGAACAAGGC TCCCTCCCCT GACTGGCCCC AGGAGGGAGA GGTGACCTTT GAGAACGCAG 3720 

AGATGAGGTA CCGAGAAAAC CTCCCTCTTG TCCTAAAGAA AGTATCCTTC ACGATCAAAC 3780

CTAAAGAGAA GATTGGCATT GTGGGGCGGA CAGGATCAGG GAAGTCCTCG CTGGGGATGG 3840 

CCCTCTTCCG TCTGGTGGAG TTATCTGGAG GCTGCATCAA GATTGATGGA GTGAGAATCA 3900

GTGATATTGG CCTTGCCGAC CTCCGAAGCA AACTCTCTAT CATTCCTCAA GAGCCGGTGC 3960 

TGTTCAGTGG CACTGTCAGA TCAAATTTGG ACCCCTTCAA CCAGTACACT GAAGACCAGA 4020 

TTTGGGATGC CCTGGAGAGG ACACACATGA AAGAATGTAT TGCTCAGCTA CCTCTGAAAC 4080 

TTGAATCTGA AGTGATGGAG AATGGGGATA ACTTCTCAGT GGGGGAACGG CAGCTCTTGT 4140 

GCATAGCTAG AGCCCTGCTC CGCCACTGTA AGATTCTGAT TTTAGATGAA GCCACAGCTG 4200 

CCATGGACAC AGAGACAGAC TTATTGATTC AAGAGACCAT CCGAGAAGCA TTTGCAGACT 4260 

GTACCATGCT GACCATTGCC CATCGCCTGC ACACGGTTCT AGGCTCCGAT AGGATTATGG 4320 

TGCTGGCCCA GGGACAGGTG GTGGAGTTTG ACACCCCATC GGTCCTTCTG TCCAACGACA 4380 

GTTCCCGATT CTATGCCATG TTTGCTGCTG CAGAGAACAA GGTCGCTGTC AAGGGCTGAC 4440 

TCCTCCCTGT TGACGAAGTC TCTTTTCTTT AGAGCATTGC CATTCCCTGC CTGGGGCGGG 4500

CCCCTCATCG CGTCCTCCTA CCGAAACCTT GCCTTTCTCG ATTTTATCTT TCGCACAGCA 4560 

GTTCCGGATT GGCTTGTGTG TTTCACTTTT AGGGAGAGTC ATATTTTGAT TATTGTATTT 4620 

ATTCCATATT CATGTAAACA AAATTTAGTT TTTGTTCTTA ATTGCACTCT AAAAGGTTCA 4680 

GGGAACCGTT ATTATAATTG TATCAGAGGC CTATAATGAA GCTTTATACG TGTAGCTATA 4740 

TCTATATATA ATTCTGTACA TAGCCTATAT TTACAGTGAA AATGTAAGCT GTTTATTTTA 4800 

TATTAAAATA AGCACTGTGC TAATAACAGT GCATATTCCT TTCTATCATT TTTGTACAGT 4860 

TTGCTGTACT AGAGATCTGG TTTTGCTATT AGACTGTAGG AAGAGTAGCA TTTCATTCTT 4920 

CTCTAGCTGG TGGTTTCACG GTGCCAGGTT TTCTGGGTGT CCAAAGGAAG ACGTGTGGCA 4980 

ATAGTGGGCC CTCCGACAGC CCCCTCTGCC GCCTCCCCAC AGCCGCTCCA GGGGTGGCTG 5040 

GAGACGGGTG GGCGGCTGGA GACCATGCAG AGCGCCGTGA GTTCTCAGGG CTCCTGCCTT 5100 

CTGTCCTGGT GTCACTTACT GTTTCTGTCA GGAGAGCAGC GGGGCGAAGC CCAGGCCCCT 5160 

TTTCACTCCC TCCATCAAGA ATGGGGATCA CAGAGACATT CCTCCGAGCC GGGGAGTTTC 5220 

TTTCCTGCCT TCTTCTTTTT GCTGTTGTTT CTAAACAAGA ATCAGTCTAT CCACAGAGAG 5280 

TCCCACTGCC TCAGGTTCCT ATGGCTGGCC ACTGCACAGA GCTCTCCAGC TCCAAGACCT 5340 



GTTGGTTCCA AGCCCTGGAG CCAACTGCTG CTTTTTGAGG TGGCACTTTT TCATTTGCCT 5400
ATTCCCACAC CTCCACAGTT CAGTGGCAGG GCTCAGGATT TCGTGGGTCT GTTTTCCTTT 5460
CTCACCGCAG TCGTCGCACA GTCTCTCTCT CTCTCTCCCC TCAAAGTCTG CAACTTTAAG 5520
CAGCTCTTGC TAATCAGTGT CTCACACTGG CGTAGAAGTT TTTGTACTGT AAAGAGACCT 5580
ACCTCAGGTT GCTGGTTGCT GTGTGGTTTG GTGTGTTCCC GCAAACCCCC TTTGTGCTGT 5640
GGGGCTGGTA GCTCAGGTGG GCGTGGTCAC TGCTGTCATC AGTTGAATGG TCAGCGTTGC 5700
ATGTCGTGAC CAACTAGACA TTCTGTCGCC TTAGCATGTT TGCTGAACAC CTTGTGGAAG 5760
CAAAAATCTG AAAATGTGAA TAAAATTATT TTGGATTTTG TAAAAAAAAA AAAAAAAAAA 5820
AAAAAAAAAA AAAAAAAA

Seq ID NO: 194 Protein sequence:
Protein Accession #: NP_005679.1

1 11 21 31 41 51
| | | | | |

MKDIDIGKEY IIPSPGYRSV RERTSTSGTH RDREDSKFRR TRPLECQDAL ETAARAEGLS
LDASMHSQLR ILDEEHPKGK YHHGLSALKP IRTTSKHQHP VDNAGLFSCM TFSWLSSLAR
VAHKKGELSM EDVWSLSKHE SSDVNCRRLE RLWQEELNEV GPDAASLRRV VWIFCRTRLI
LSIVCLMITQ LAGFSGPAFM VKHLLEYTQA TESNLQYSLL LVLGLLLTEI VRSWSLALTW
ALNYRTGVRL RGAILTMAFK KILKLKNIKE KSLGELINIC SNDGQRMFEA AAVGSLLAGG
PVVAILGMIY NVIILGPTGF LGSAVFILFY PAMMFASRLT AYFRRKCVAA TDERVQKMNE
VLTYIKFIKM YAWVKAFSQS VQKIREEERR ILEKAGYFQG ITVGVAPIVV VIASVVTFSV
HMTLGFDLTA AQAFTVVTVF NSMTFALKVT PFSVKSLSEA SVAVDRFKSL FLMEEVHMIK
NKPASPHIKI EMKNATLAWD SSHSSIQNSP KLTPKMKKDK RASRGKKEKV RQLQRTEHQA
VLAEQKGHLL LDSDERPSPE EEEGKHIHLG HLRLQRTLHS IDLEIQEGKL VGICGSVGSG
KTSLISAILG QMTLLEGSIA ISGTFAYVAQ QAWILNATLR DNILFGKEYD EERYNSVLNS
CCLRPDLAIL PSSDLTEIGE RGANLSGGQR QRISLARALY SDRSIYILDD PLSALDAHVG
NHIFNSAIRK HLKSKTVLFV THQLQYLVDC DEVIFMKEGC ITERGTHEEL MNLNGDYATI
FNNLLLGETP PVEINSKKET SGSQKKSQDK GPKTGSVKKE KAVKPEEGQL VQLEEKGQGS
VPWSVYGVYI QAAGGPLAFL VIMALFMLNV GSTAFSTWWL SYWIKQGSGN TTVTRGNETS
VSDSMKDNPH MQYYASIYAL SMAVMLILKA IRGVVFVKGT LRASSRLHDE LFRRILRSPM
KFFDTTPTGR ILNRFSKDMD EVDVRLPFQA EMFIQNVILV FFCVGMIAGV FPWFLVAVGP
LVILFSVLHI VSRVLIRELK RLDNITQSPF LSHITSSIQG LATIHAYNKG QEFLHRYQEL
LDDNQAPFFL FTCAMRWLAV RLDLISIALI TTTGLMIVLM HGQIPPAYAG LAISYAVQLT
GLFQFTVRLA SETEARFTSV ERINHYIKTL SLEAPARIKN KAPSPDWPQE GEVTFENAEM
RYRENLPLVL KKVSFTIKPK EKIGIVGRTG SGKSSLGMAL FRLVELSGGC IKIDGVRISD
IGLADLRSKL SIIPQEPVLF SGTVRSNLDP FNQYTEDQIW DALERTHMKE CIAQLPLKLE
SEVMENGDNF SVGERQLLCI ARALLRHCKI LILDEATAAM DTETDLLIQE TIREAFADCT
MLTIAHRLHT VLGSDRIMVL AQGQVVEFDT PSVLLSNDSS RFYAMFAAAE NKVAVKG



Seq ID NO: 195 DNA sequence 
Nucleic Acid Accession #: NM_0 06470 
Coding sequence: 228.. 1922 



GCTGTCCTGA 
CGCCAGCACA 
TTGCAGCAGC 
TGGGCCAAGG 
ATCTAATGGC 
CAGACTCTGG 
TGGGCTCCTC 
AGGGGGATCC 
GAAGAGTGAA 
TGCAGCCGCA 
ACCACAACTG 
ATCAGCAGTG 
TGGATGCAGC 
GGAAACTCAA 
TGGTGTCGGT 
CTGTGAGGAA 
TGAGCCAGGC 
GCAAGCAGGA 
ACTGCAAGTT 
ATAAACTCTC 
TGGAGAACTA 
CTCAAGTGTC 
GGGAACAGTT 
ATCTCCGGCT 
ACCCGGACCT 
ACCTGCACAG 
CCTGCAAAGG 
TCTCCTGGAG 
CCCCACTCAA 
TCCTTTCCTT 
AATTTTCAGA 
TTGTAGATCT 
AGACTCCAGG 
GGTGATTTGT 
TCTGAATGAA 
GTTTGCAGTA 
AGAGCAGTGG 
CTGCCTCAGC 
TTTGTATTTT 



11 
I 

GCCTGAGTAC 
CAGTAATGAG 
TGCAATCATC 
GACAGAAGAA 
TCCAGGGCCA 
GTCACCCAGC 
GGAGAAGCTT 
TGCTGGTGAG 
GGCAGTGAAG 
TCAGGTGAAC 
GCGATACTGC 
CATCTGCCAG 
CCGCAGGGAC 
GTTGAATGAA 
GTCAGAGGTC 
GGCCCAGGCC 
CAACGGTATC 
GCTGGAGAGG 
TAAGAACACT 
GGGCATCCGC 
TAAGAAAAAG 
TGCCGTTGTT 
CCTCCAATAT 
GCAGGAGGAG 
CCCCAGCAGG 
GTACTATTTT 
CATCGACCGG 
CCTCCAATGG 
AGCTGGCCCT 
CTATGGCGTA 
ACCAGTCTAT 
GGGAGAGGAA 
AGCCATATCC 
GGGCAGAAAT 
AACATTCTCC 
ATTCTTTTTT 
CGCGATCTTG 
CTCCCGAGTA 
TAGTAGAGAT 



21 



TCTAGCTGCC 
TGGCCGAGCT 
TAGGCGTGGT 
AGACAGCCTA 
CTGCCCAGGG 
CCAGATTCTG 
GGCAGGGAGA 
GGGAAAGAGG 
TCCTGTCTAA 
ATCAAACTGC 
CCTGCCCACC 
GACTGTTGCC 
AAGGAGGCTG 
AATGCCATCT 
AAAGCGGTGG 
AATGTGATGC 
AAGGCCCACC 
ATGGCGGCCA 
GAAGACATCA 
AAAGTTATCA 
CTCCAGGAGT 
CAGCGCAAAT 
GCGTATGACA 
AACCGCAAGG 
TTCCTGCACT 
GAGGTGGAGA 
AAAGGGGAGG 
AACGGGAAGG 
TTCCGGAGGC 
GAGTATGATA 
GCTGCCTTCT 
CCCGAGAAGC 
CAGACCTTTG 
AACTGCTGAT 
AGCTGCTCTC 
TTTTTTTTGA 
GCTCACTGCA 
GCTGGGATTA 
GGGGTTTCAC 



31 
I 

TTGTCGCCAT 
TCCTCTGGGA 
TCTCTTGTCT 
GGAGCAGAGC 
CCACTGCTCA 
GGTCAGCCAG 
CGGAGGAACA 
TCCTGTGTGA 
CCTGCATGGT 
AAAGCCACCT 
ACAGCCCACT 
AGGAGCACAG 
AACTCCAGTG 
CCAGGCTCCA 
CTGAAATGCA 
TCTTCTTAGA 
TGGAGTACAG 
TCAGCAACAC 
CCTTCCCTAG 
CGGAATCCAC 
TTTCCAAGGA 
ATTGGACTTC 
TCACGTTTGA 
TCACCAACAC 
GGCGGCAGGT 
TCTTCGGGGC 
AGCGCAACAG 
AGTTCACGGC 
TCGGGGTCTA 
CCATGACTCT 
GGCTTTCCAA 
CAGCACCGTC 
CCAGCTACAG 
GGTAGCTGGC 
TTTTGCTCCA 
GACGGAGTCT 
AGCTCCGCCT 
CAGGTGCCTG 
CATGTTGGCC 



41 

I 

CGCATCTGGC 
GGGAGGAAAC 
GACTTGGGCT 
CTCCCAGATG 
GCCCCCAGCC 
CCCAGTGGAA 
GGACAGCGAC 
CTTCTGCCTT 
GAATTACTGT 
GCTGACCGAG 
GTCTGCTTTC 
TGGCCACACC 
CACCCAGTTA 
GGCTAACCAA 
GTTTGGGGAA 
GGAGAAGGAG 
GAGTGCCGAG . 
TGTCCAGTTC 
TGTTTACGTA 
TGTACACTTA 
AGAGGAGTAT 
CAAACCTGAG 
CCCGGACACA 
CACGCCCTGG 
GCTGTCCCAG 
AGGCACCTAT 
TTGCATTTCC 
CTGGTACAGT 
TATCGACTTC 
GGTTCACAAG 
GAAGGAAAAC 
CTTGGGGGTG 
TGATGGGATT 
TTTTGAAATC 
TATGGTGCTG 
CGCACTGTTG 
CCCGAGTTCA 
CCACCACACC 
AGGCAGATCT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 



51 
I 

TGCCATCCAG 
AGTTAAAATC 
GCACAGATCC 
GCTGAGTTGG 
CCTCTCAGCC 
GAAGAGGACG 
TCTGCAGAGC 
GATGACACCA 
GAAGAGCACT 
CCAGTGAAGG 
TGCTGCCCTG 
ATAGTCTCCC 
GACTTGGAGC 
AAGTCTGTTC 
CTCCTTGCTG 
CAAGCTGCGC 
ATGGAG AAG A 
TTGGAGG AGT 
GGGCTGAAGG 
ATCCAGTTGC 
GACATCAGAA 
CCCAGCACCA 
GCACACAAGT 
GAGCATCCCT 
CAGAGTCTGT 
GTTGGCCTGA 
GGAAACAACT 
GACATGGAGA 
CCGGGAGGGA 
TTTGCCTGCA 
GCCATCCGGA 
ACTGCTCCCT 
TGCATTTTAG 
CTATGGGGTC 
TTCTCTATGT 
CCCAGGCTGG 
AGCAATTCTC 
CAGCTAATGT 
CAAACTCCTG 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 




ACCTCGTGAT GCACCCACCT CGGCCTCCCA AAGTGCTGGG ATTACATGCG TGAGCCACTG 2400 
CGCCCTGCCT GTTTGTAOTA ATTTTTAGGC ACCAAATCTC CCTCATCTTC TAGTGCCATT 2460 
CTCCTCTCTG TTCAGGTAAA TGTCACACTG TGCCCAGAAT GGATGACCAG GAACCTTAAA 2520 
GAGTGGCTGA AAAGATTGCA GAGTTATCAT AATAAATTGC TAACTTGCGT 






Seq ID NO: 196 Protein sequence: 
Protein Accession #: NP_006461 






1 
I 

MAELDLMAPG 
DSAEQGDPAG 
EPVKDHNWRY 
LDLERKLKLN 
EQAALSQANG 
VGLKDKLSGI 
EPSTREQPLQ 
QQSLYLHRYY 
SDMETPLKAG 
NAIRIVDLGE 



11 
I 

PLPRATAQPP 
EGKEVLCDFC 
CPAHHSPIiSA 
ENAISRLQAN 
IKAHLEYRSA 
RKVITESTVH 
YAYDITFDPD 
FEVEIFGAGT 
PFRRLGVYID 
EPEKPAPSLG 



21 
I 

APLSPDSGSP 
LDDTRRVKAV 
FCCPDQQCIC 
QKSVLVSVSE 
EMEKSKQELE 
LIQLLENYKK 
TAHKYLRLQE 
YVGLTCKGID 
FPGGIIiSFYG 
VTAP 



31 
I 

SPDSGSASPV 
KSCLTCMVNY 
QDCCQEHSGH 
VKAVAEMQPG 
RMAAISNTVQ 
KLQEFSKEEE 
ENRKVTNTTP 
RKGEERNSCI 
VEYDTMTLVH 



Seq ID NO: 197 DNA sequence 
Nucleic Acid Accession #: NM_004316 
Coding sequence: 433-1149 



CCCGAGACCC 
GCGGGTTCAG 
GCGCCAGCGG 
GGAGGAGGGG 
TTGCTCCCAC 
GCTCCCGCTT 
GTCCCCCTCG 
CCGCCGCTGC
CAGCCGCAGC
GCCGCGGCGG 
CAGCAGCAGC 
TCAGGGGGCG 
GAACTGATGC 
CAGCAGCCGG 
AACCTGGGCT 
AGTAAGGTGG 
GACGAGCATG 
CCCAACTACT 
GACGAGGGCT 
TGGTTCTGAG 
ATCGCACAAC 
AAAAGAAAAA 
CCAACCCCAT 
AGCGCTCAGA 
ACCTGAGTCA 
GAGCAGCACA 
GCTCGGGTCC 
GAGTTGGTGT 



11 

I 

GGCGCAAGAG 
CACTGACTTT 
CAGCCTCACA 
AGGGAGGAGG 
TCTAAGAAGT 
CATATTTCCT 
CGGGCCCCGC 
GCATGGAAAG 
CCCAGCAGCC 
CCGCAGCCGC 
AGCAGCAGCA 
GTCACAAGTC 
GCTGCAAACG 
CCGCCGTGGC 
TTGCCACCCT 
AGACACTGCG 
ACGCGGTGAG 
CCAACGACTT 
CTTACGACCC 
GGGCTCGGCC 
CTGCATCTTT 
AAAAGAAGAA 
CGCCAACTAA 
ACAGTATCTT 
ATGCGCAAAA 
CGCGTTATAG 
CTTCACCTCC 
CTTTC 



21 
I 

AGCGCAGCCT 
TGCTGCTGCT 
CGCGAGCGCC 
AGGCGGCGTG 
CTCCOGGGGA 
TTTCTTTCCC 
ACCTCGCGTC 
CTCTGCCAAG 
CTTCCTGCCG 
CGCAGCGGCA 
GCAGGCGCCG 
AGCGCCCAAG 
CCGGCTCAAC 
GCGCCGCAAC 
TCGGGAGCAC 
CTCGGCGGTC 
CGCCGCCTTC 
GAACTCCATG 
GCTCAGCCCC 
TGGTCAGGCC 
AGTGCTTTCT 
GAAGAAGAAA 
GCGAGGCATG 
TGCACTCCAA 
TGCAGCTTGT 
TAACTCCCAT 
CCGCCCTTTC 



31 

I 

TAGTAGGAGA 
TCTGCTTTTT 
ACGCGAGGCT 
CAGGGAGGAG 
TTTTGTATAT 
TCTCTGTTCC 
CCGGATCGCT 
ATGGAGAGCG 
CCCGCAGCCT 
GCGCAGAGCG 
CAGCTGAGAC 
CAAGTCAAGC 
TTCAGCGGCT 
GAGCGCGAGC 
GTCCCCAACG 
GAGTACATCC 
CAGGCAGGCG 
GCCGGCTCGC 
GAGGAGCAGG 
CTGGTGCGAA 
TGTCAGTGGC 
AGAGAAGAAG 
CCTGAGAGAC 
TCATTCACGG 
GTGCAAAAGC 
CACCTCTAAC 
TTAGAGTGCA 



41 

I 

EEEDVGSSEK 
CEEHLQPHQV 
TIVSLDAARR 
ELLAAVRKAQ 
FLEEYCKFKN 
YDIRTQVSAV 
WEHPYPDLPS 
SGNNPSWSLQ 
KFACKFSEPV 



41 
I 

GGAACGCGAG 
TTTTTCTTAG 
CCCGAAGCCA 
AAAAAGCATT 
ATTTTTTAAC 
TGCACCCAAG 
CTGATTCCGC 
GCGGCGCCGG 
GTTTCTTTGC 
CGCAGCAGCA 
CGGCGGCCGA 
GACAGCGCTC 
TTGGCTACAG 
GCAACCGCGT 
GCGCGGCCAA 
GCGCGCTGCA 
TCCTGTCGCC 
CGGTCTCATC 
AGCTTCTCGA 
TGGACTTTGG 
GTTGGGAGGG 
AAAAAAACGA 
ATGGCTTTCA 
AGATATGAAG 
AGTGGGCTCC 
ACGCACAGCT 
GTTCTTAGCC 



51
I 

LGRETEEQDS 
NIKLQSHLLT 
DKEAELQCTQ 
ANVMLFIiEEK 
TEDITFPSVY 
VQRKYWTSKP 
RFLHWRQVLS 
WNGKEFTAWY 
YAAFWLSKKE 



51 
I 

ACGCGGCAGA 
AAACAAGAAG 
ACCCGCGAAG 
TTCACCTTTT 
TTCCGTCAGG 
TTCTCTCTGT 
GACTCCTTGG 
CCAGCAGCCC 
CACGGCCGCA 
GCAGCAGCAG 
CGGCCAGCCC 
GTCTTCGCCC 
CCTGCCGCAG 
CAAGTTGGTC 
CAAGAAGATG 
GCAGCTGCTG 
CACCATCTCC 
CTACTCGTCG 
CTTCACCAAC 
AAGCAGGGTG 
GGAGAAAAGG 
AAACAGTCAA 
GAAAACGGGA 
AGCAACTGGG 
TGGCAGAAGG 
GAAAGTTCTT 
CTCTAGAAAC 



Seq ID NO: 198 Protein sequence: 
Protein Accession #: NP_004307 

1          11         21         31         41         51
|          |          |          |          |          |

MESSAKMESG GAGQQPQPQP QQPFLPPAAC FFATAAAAAA AAAAAAAQSA QQQQQQQQQQ 
QQQQAPQLRP AADGQPSGGG HKSAPKQVKR QRSSSPELMR CKRRLNFSGF GYSLPQQQPA 
AVARRNERER NRVKLVNLGF ATLREHVPNG AANKKMSKVE TLRSAVEYIR ALQQLLDEHD 
AVSAAFQAGV LSPTISPNYS NDLNSMAGSP VSSYSSDEGS YDPLSPEEQE LLDFTNWF 

Seq ID NO: 199 DNA sequence 
Nucleic Acid Accession #: NM_007015 
Coding sequence: 1-1005 



ATGACAGAGA 
TGCAGCCCCC 
AAGGTGGGAG 
GCCTTCTACT 
ATCAATGGGA 
TTTAAAATGG 
ACAGGAATTC 
ATTCCTGAGG 
ATGCCAGTCA 
GACAACAGCT 
CTTAAACCAA 
GTTCCAACTA 
CTGAATAATG 



11 
I 

ACTCCGACAA 
CGGCGTACGC 
CCGTGGTCCT 
TCTGGAAGGG 
AACTACAAGA 
GAAGTGGAGC 
GTTTTGCTGG 
TGGGCGCCGT 
AATATGAAGA 
TCTTGAGTTC 
CCTATCCAAA 
CCACAAAAAG 
AAACCAGACC 



21 

I 

AGTTCCCATT 
TACGCTGACG 
CATTTCGGGA 
GAGCGACAGT 
TGGGTCAATG 
TGAAGAAGCA 
AGGAGAGAAG 
GACCAAACAG 
AAATTCTCTT 
TAAGGTGTTA 
AGAAATCCAG 
ACCACACAGT 
CAGTGTTCAA 



31 
I 

GCCCTGGTGG 
GTGAAGCCCT 
GCTGTGCTGC 
CACATTTACA 
GAAA7AGACG 
ATTGCAGTTA 
TGCTACATTA 
AGCATCTCCT 
ATCTGGGTGG 
GAACTCTGCG 
AGGGAAAGAA 
GGACCACGGA 
GAGGACTCAC 



41 
I 

GACCTGATGA 
CCAGCCCCGC 
TGCTCTTTGG 
ATGTCCATTA 
CTGGGAACAA 
ATGATTTCCA 
AAGCGCAAGT 
CCAAACTGGA 
CTGTAGATCA 
GTGACCTTCC 
GAGAAGTGGT 
GCAACCCAGG 
AAGCCTTCAA 



51 
I 

CGTGGAATTC 
GCGGCTGCTC 
GGCCATCGGG 
CACCATGAGT 
CTTGGAGACC 
GAATGGCATC 
GAAGGCTCGT 
AGGCAAGATC 
GCCTGTGAAG 
TATTTTCTGG 
AAGAAAAATT 
CGCTGGAAGA 
TCCTGATAAT 



60 
120 
180 
240 
300 
360 
420 
480 
540 



60 
120 
180 
240 
300 
360 
420 
480
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 



60 
120 
180 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 




CCTTATCATC AGCAGGAAGG GGAAAGCATG ACATTCGACC CTAGACTGGA TCACGAAGGA 840

ATCTGTTGTA TAGAATGTAG GCGGAGCTAC ACCCACTGCC AGAAGATCTG TGAACCCCTG 900 

GGGGGCTATT ACCCATGGCC TTATAATTAT CAAGGCTGCC GTTCGGCCTG CAGAGTCATC 960 

ATGCCATGTA GCTGGTCGGT GGCCCGTATC TTGGGCATGG TOTGAAATCA CTTCATATAT 1020 

CACGTGCTGT AAAATAAGAA CTAGCTGAAG AGACAACCAA AOAAGCATTA AGGCAGGTTG 1080 

ATGCTGATGG GACCATAAAA TATTTTTACA CGCAGCCTGA GCGGTTATTC TTGACACTCT 1140 

TAACAGAATT TTTTTAATCG TTTTCCAGAA CTTTAGTATA TGCAAATGCA CTGAAAGGGT 1200 

AGTTCAAGTC TAAAATGCCA TAACCCCGTT ATTTGTTATT TTTTATTTGC ATTGATTTGC 1260 

CATAAGTCTT CCCTTGCTTG CATCTTCCAA AGCTATTTOG AAATAAACAC GAAAATTTAC 1320 
AGTTTGCC 



Seq ID NO: 200 Protein sequence:
Protein Accession #: NP_008946



1          11         21         31         41         51
|          |          |          |          |          |
MTENSDKVPI ALVGPDDVEP CSPPAYATLT VKPSSPARLL KVGAWLISG AVLLLFGAIG
AFYFWKGSDS HIYNVHYTMS INGKLQDGSM EIDAGNNLET FKMGSGAEEA IAVNDFQNGI
TGIRFAGGEK CYIKAQVKAR IPEVGAVTKQ SISSKIiEGKI MPVKYEENSL IWVAVDQPVK
DMSFLSSKVL ELCGDliPIFW LKPTYPKEIQ RERREWRKI VPTTTKRPHS GPRSNPGAGR
LNNETRPSVQ EDSQAFNPDN PYHQQEGESM TFDPRLDHEG ICCIECRRSY THCQKICEPL
GGYYPWPYNY QGCRSACRVI MPCSWWVARI LGMV



Seq ID NO: 201 DNA sequence 
Nucleic Acid Accession #: NM_000728
Coding sequence: 112..495



i 

GTAATAAGAG 
GTCGACCGGC 
CGGAAGTTCT 
CAGGCGGCGC 
GAGGACGCGC 
GAGCTGAAGC 
AACACTGCCA 
GTGAAGAGCA 
GACCTTCAAG 
CATATCCTTA 
AAGGAGGCAC 
TGGAAGAAGA 
GAGAATAATT 
GGAAACTAAT 
GGTTATTTGG 
ACTGTACCAC 
GTATGTAGCA 
AGCATCTATT 
AAATTTTTGT 
TGGATGCAAG 
TTGCTTTTTC 
TCCAATTCAT 
TTGCCTAACT 
TATAGTTTTA 
TGAGAGGTGT 
TTTGTTAAAA 
CTGATCATAT 
ACCATTCTTT 
ACCAGATAAT 
CTAAAATATT 
TTCTGATGAG 
CATATTAATA 
GTTTTCTCTG 
ATATCTTGTT 
ATTTTTGTTT 
AAAAAAAAAA 



11
I 

CGGGGTCTCC 
CGCTCGCGCT 
CCCCCTTCCT 
CATTCAGGTC 
GCCTCCTGCT 
AGGAGCAGGA 
CCTGTGTGAC 
ACTTCGTGCC 
CCTGAGCAGA 
TAAGAGATTC 
AAGCCAAGGA 
GCAGCCCTGC 
TCTGTTGTTT 
ACAATACATT 
AAAGTGTGTA 
TTCGCCTTCT 
GTATCTCATT 
TTACCATATG 
TGGCTTGCTT 
ATTGTTTTCA 
ATTTTCTTAG 
CTTTTTTTTT 
AAGGTCCCAA 
TATTTTATAT 
AGGTTGAAAT 
AGACTGTTAT 
TTGTGTGGGT 
TGCCAATGTC 
GTGGGTCTAC 
TTCTACATCT 
ATTTTTAATG 
ATATTAAGTC 
TTTTTTTTTT 
AGATTTTTAA 
TTAATTGTTC 
AAAAAAAAAA 



21 

I 

GCGGGGAAGG 
GCCCTGAAAC 
GGCTCTCAGT 
TGCCCTGGAG 
GGCTGCACTG 
GACACAGGGC 
TCATCGGCTG 
CACCAATGTG 
TGAATGACTC 
ACTCAGAAGA 
AGTCTGTGTC 
TGACACCTAG 
TAAGCCACAA 
TTCATTTATT 
TTTAACTCTG 
TGCCAGCCAC 
GCTGTTTTAA 
TTTATCACCT 
GCTTTATTAG 
GATATATAGT 
CAGTGTCTCT 
CTTTTATGTA 
GGTCACAATA 
GTAGATTAGT 
TCATACCTGT 
TTCACCATTT 
ATATTTCTGG 
ATACTGCCTT 
CAACATTGTT 
TTTATACATT 
GGATTGTGTT 
GTTCAATTCA 
TTTAACAGTG 
CTATTTTATT 
ATTGCTAGTA 
AAAAAAAAAA 



31 



41 



CGCCCACAGC AGGTGTGGTG 
TCTAGTCGCC AGAGAGGCGG 
ATCTTGGTCC TGTACCAGGC 
AGCAGCCCAG ACCCGGCCAC 
GTGCAGGACT ATGTGCAGAT 
TCCAGCTCCG CTGCCCAGAA 
GCAGGCTTGC TGAGCAGATC 
GGTTCCAAAG CCTTTGGCAG 
CAGGAAGAAG GTGTGTCCTA 
CACATGTGGA GAAGGTGACA 
TACCAGAAGC CAGAATCACA 
AGTTTGGACT TCCAGCTTCC 
AGTTTGTGGT AATTTGTTAT 
TTGGGTAAAT GCCTTGGAGT 
TAAGAAACTG CCAAACTATT 
ATATGAGAGC TCTAGTATTT 
TTTGTATTTC CCCAATGACT 
TTATTGAAGG GTCTGTTTAA 
TGTTGAGTTT TTAGAGCTCT 
TTGGAAACTT CCTTCCCCTG 
CACAGAGAAA AAGTTGTAAT 
TTGTGCTTTT AGTTCATGTC 
ACCTTATTCT ATACTTTCTT 
GATCTATTTT GAGTTAATTT 
GAATATAGAT ACCCAATTGT 
AATTGCCCCT GCACCTTTGT 
GTTCTCAATT CTGTCTCATT 
GATTAGTGTA GTGTTAAAGT 
CATTCTTGTT CAAAAAGATT 
TTAGAATCAG TGTGTTACTA 
AAATCAGTGG GTTAATTTTG 
TGAACACAAT ACATGTTTTC 
TTCTCAGTTT TCAACAGAAA 
TTTTGGTGCT AATGTAAATG 
GATAGAAATA CAATATTTAA 



51 
I 

TTCATCCCGG 
CATGGGTTTC 
GGGCAGCCTC 
ACTCAGTAAA 
GAAGGCCAGT 
GAGAGCCTGC 
AGGGGGCATG 
GCGCCGCAGG 
AATCCAATGA 
TGACAGAGGC 
GAACAGTCTC 
AGAACTGTGA 
GACAGCCCTA 
GGGATTGCTG 
TTCTGAAGTG 
CCACAAATAG 
AATGACGTTG 
ATCTTCTGCT 
TTATATGTTG 
AATCTGCGGA 
TTGAATAAGA 
TAAGAACTCT 
GTAAAAGTTT 
TTGTATAAGG 
TTCAGTGCCA 
CAAAAAGCAA 
GATTGATTTG 
GAATCTCAAA 
TTAGCTACAT 
TCTACAAAAT 
GGAGAATTAG 
ACTTATTTAG 
TATTCTACAC 
GTACTTAAAC 
AATATTAGGA 



Seq ID NO: 202 Protein sequence: 
Protein Accession #: NP_000719.1



1          11         21         31         41         51
|          |          |          |          |          |



60 
120 
180 
240 
300 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500
1560
1620
1680
1740 
1800 
1860 
1920 
1980 
2040 
2100 



MGFRKFSPFL ALSILVLYQA GSLQAAPFRS ALESSPDPAT LSKEDARLLL AALVQDYVQM 
KASELKQEQE TQGSSSAAQK RACNTATCVT HR1AGLLSRS GGMVKSNFVP TNVGSKAFGR 
RRRDLQA 

Seq ID NO: 203 DNA sequence 
Nucleic Acid Accession #: NM_001741
Coding sequence: 71..496



60 
120 



CTCTGGCTGG 
GAGAGGTGTC 
GTTGCAGGCA 
AGACCCGGCC 
CTATGTGCAG 



11 

I 

ACGCCGCCGC 
ATGGGCTTCC 
GGCAGCCTCC 
ACGCTCAGTG 
ATGAAGGCCA 



21 



31 



41 
I 



51 



CGCCGCTGCC ACCGCCTCTG ATCCAAGCCA CCTCCCGCCA 
AAAAGTTCTC CCCCTTCCTO GCTCTCAGCA TCTTGGTCCT 
ATGCAGCACC ATTCAGGTCT GCCCTGGAGA GCAGCCCAGC 
AGGACGAAGC GCGCCTCCTG CTGGCTGCAC TGGTGCAGGA 
GTGAGCTGGA GCAGGAGCAA GAGAGAGAGG GCTCCAGCCT 



60 
120 
180
240 
300 




GGACAGCCCC AGATCTAAGC GGTGCGGTAA TCTGAGTACT TGCATGCTGG GCACATACAC 360 

GCAGGACTTC AACAAGTTTC ACACGTTCCC CCAAACTGCA ATTGGGGTTG GAGCACCTGG 420 

AAAGAAAAGG GATATGTCCA GCGACTTGGA GAGAGACCAT CGCCCTCATG TTAGCATGCC 480 

CCAGAATGCC AACTAAACTC CTCCCTTTCC TTCCTAATTT CCCTTCTTGC ATCCTTCCTA 540 

TAACTTGATG CA7GTGGTTT GGTTCCTCTC TGGTGGCTCT TTGGGCTGGT ATTGGTGGCT 600 

TTCCTTGTGG CAGAGGATGT CTCAAACTTC AGATGGGAGG AAAGAGAGCA GGACTCACAG 660

GTTGGAAGAG AATCACCTGG GAAAATACCA GAAAATGAGG GCCGCTTTGA GTCCCCCAGA 720 

GATGTCATCA GAGCTCCTCT GTCCTGCTTC TGAATGTGCT GATCATTTGA GGAATAAAAT 780 
TATTTTTCCC C 






Seq ID NO: 204 Protein sequence: 
Protein Accession #: NP_001732



1          11         21         31         41         51
|          |          |          |          |          |



MGFQKFSPFL ALSILVLLQA GSLHAAPFRS ALESSPADPA TLSEDEARLL LAALVQDYVQ 
MKASELBQBQ EREGSSLDSP RSKRCGNLST CMLGTYTQDF NKFHTFPQTA IGVGAPGKKR 
DMSSDLERDH RPKVSMPQNA N 

Seq ID NO: 205 DNA sequence 
Nucleic Acid Accession #: NM_005361
Coding sequence: 1-945 



ATGCCTCTTG 
GAGGCCCTGG 
TCCTCTTCTA 
CCTCCCCACA 
AGACAATCCG 
CTGGAGTCCG 
CTCCTCAAGT 
AGAAATTGCC 
GTCTTTGGCA 
TGCCTGGGCC 
CTCCTGATAA 
ATCTGGGAGG 
CATCCCAGGA 
GTGCCCGGCA 
ACCAGCTATG 
TACCCACCCC 



11 
I 

AGCAGAGGAG 
GCCTGGTGGG 
CTCTAGTGGA 
GTCCTCAGGG 
ATGAGGGCTC 
AGTTCCAAGC 
ATCGAGCCAG 
AGGACTTCTT 
TCGAGGTGGT 
TCTCCTACGA 
TCGTCCTGGC 
AGCTGAGTAT 
AGCTGCTCAT 
GTGATCCTGC 
TGAAAGTCCT 
TGCATGAACG 



21 
I 

TCAGCACTGC 
TGCGCAGGCT 
AGTTACCCTG 
AGCCTCCAGC 
CAGCAACCAA 
AGCAATCAGT 
GGAGCCGGTC 
TCCCGTGATC 
GGAAGTGGTC 
TGGCCTGCTG 
CATAATCGCA 
GTTGGAGGTG 
GCAAGATCTG
ATGCTACGAG 
GCACCATACA 
GGCTTTGAGA 



31 
I 

AAGCCTGAAG 
CCTGCTACTG 
GGGGAGGTGC 
TTCTCGACTA 
GAAGAGGAGG 
AGGAAGATGG 
ACAAAGGCAG 
TTCAGCAAAG 
CCCATCAGCC 
GGCGACAATC 
ATAGAGGGCG 
TTTGAGGGGA 
GTGCAGGAAA 
TTCCTGTGGG 
CTAAAGATCG 
GAGGGAGAAG 



41 
I 

AAGGCCTTGA 
AGGAGCAGCA 
CTGCTGCCGA 
CCATCAACTA 
GGCCAAGAAT 
TTGAGTTGGT 
AAATGCTGGA 
CCTCCGAGTA 
ACTTGTACAT 
AGGTCATGCC 
ACTGTGCCCC 
GGGAGGACAG 
ACTACCTGGA 
GTCCAAGGGC 
GTGGAGAACC 
AGTGA 



51 
I 

GGCCCGAGGA 
GACCGCTTCT 
CTCACCGAGT 
CACTCTTTGG 
GTTTCCCGAC 
TCATTTTCTG 
GAGTGTCCTC 
CTTGCAGCTG 
CCTTGTCACC 
CAAGACAGGC 
TGAGGAGAAA 
TGTCTTCGCA 
GTACCGGCAG 
CCTCATTGAA 
TCACATTTCC 



Seq ID NO: 206 Protein sequence: 
Protein Accession #: NP_005352



1          11         21         31         41         51
|          |          |          |          |          |
MPLEQRSQHC KPEEGLEARG EALGLVGAQA PATEEQQTAS SSSTLVEVTL GEVPAADSPS
PPHSPQGASS FSTTINYTLW RQSDEGSSNQ EEEGPRMFPD LESEFQAAIS RKMVELVHFL
LLKYRAREPV TKAEMLESVL RNCQDFFPVI FSKASEYLQL VFGIEWEW PISHLYILVT
CLGLSYDGLL GDNQVMPKTG LLIIVLAIIA IEGDCAPEEK IWEELSMLEV FEGREDSVFA
HPRKLLMQDL VQENYLEYRQ VPGSDPACYE FLWGPRALIE TSYVKVLHHT LKIGGEPHIS
YPPLHERALR



Seq ID NO: 207 DNA sequence 
Nucleic Acid Accession #: NM_021115
Coding sequence: 743-2893 



1 
I 

AAAGGAAGGG 
GGCACCGCCC 
CCCAAACTAA 
CCCTTTGGGT 
GCACCCTGAA 
GGGCGAGCTG 
ACCGCTGCTT 
TTCGCTCAAG 
CACTGTCCAA 
CACGGAGAAG 
AGAAGTGCCC 
GCAAATCTCC 
ACCCGGGGAG 
GGCCCTGATG 
GACCACTACC 
CTGCAGTGTG 
GCCCCTCAAC 
GGAGCTCCAG 
GGACGGCCCT 
CCGAAGCCCC 
GACCTTCCAG 
CTCTGGGGAT 
CCTGGGCTAT 
CTGGAGCAGC 
CATCGGCCGC 
CTGGACGATT 



11 
I 

AGGGAGGGAG 
TTAGGAGGGC 
CTGGTGTCTT 
CCTTACCTCC 
GAGAGAGTGG 
GTGCTGGATG 
CCAGAGGAGG 
CAGGTGAACT 
AGGGCAGGGT 
CCTGGCCCAC 
CTTTGGCTGG 
CCCTTCACTT 
CCTGGGCCTG 
GACAAAGGTG 
TCCACCATTA 
AGCTTCTCCA 
AACTTTCTGG 
GTGAAGAGTG 
ACCCTGACCG 
ACCAACACCA 
CTTCACTACC 
GTCACGGTGA 
GAGCTCCAGG 
CAGGAGCCCA 
GTCCTCTCCC 
GAAGCTCCAG 



21 
I 

AAAGGAGAAG 
CACCCTCAGA 
TTCTCCTCTT 
TGCCCTCAGG 
TAACAGCGCC 
GGACCGCACC 
CCCGCCCCAA 
CTGCCAGGAA 
CCCAGCCAGC 
CGGGGGACCC 
ACCGAAAGGA 
CGCAGCCCTA 
ACATGGCCCA 
AGAATGAGCT 
TCACCACCAC 
ATCCTGAGGG 
AGTGCACATA 
TGAACCTGTC 
TCCTGGCCAA 
TCTCCGTCTA 
AGGCCTTCAT 
TGGACCTGCA 
GCGCTAAGAT 
TCTGCTCAGC 
CAAGTTACCC 
AGGGCCAGAA 



31 
I 

TTGGTTTAGA 
GTCTGACAGC 
CCAAGATGCT 
AGCCCCGGAG 
CCCCAGTTCC 
CTCTGCACAT 
GCACGCCTTG 
GCAGCTGAGG 
GTCCCAGGGC 
GGACCCCATC 
GAGTGCGGTC 
TGTGGCCCAC 
GGAGGCCCCC 
GACTGGGTCA 
GGTCATCACC 
GTACATTGAC 
CAACGTGACA 
CGATGGGGAA 
CCAGACACTC 
CTTCCGGACC 
GCTGAGCTGC 
CTCAGGTGGG 
GCTGACATGC 
TCCTTGTGGA 
TGAAAACACA 
GCTGCACCTG 



41 

I 

GGCCAGCCGG 
AGGTGAAGGT 
CTTCCCGAGG 
AGAGGCAGTC 
TCACAGTCGG 
CACGACATCC 
CCCCCCAAGA 
CCCAAGGCCA 
CTAGATCTCC 
GTGGCCTCCG 
CCTACAACAC 
ACACTCCCCC 
CAGGAGGACA 
GCCTCAGAGG 
ACCGAGCAGG 
TCCAGCGACT 
GTCTACACTG 
CTGCTCTCCA 
CTGGTGGAGG 
TTCCAGGACG 
AACTTTCCCC 
GTGGCCCACT 
ATCAATGCCT 
GGGGCAGTGC 
AATGGGAGCC 
CACTTTGAGA 



51 
i 

ACGAGCTTTG 
CCTAAATCTC 
GAGATGCTAG 
CTGGCAAAGA 
CGGAAGTGCT 
CAGCCCTGTC 
AGAAACTGCC 
CCTCCGCAGC 
TCTCCTCCTC 
AGGAGGCATC 
CCGCACCCCT 
AGAGGCCAGA 
CCAGCCCCAT 
AGAGCCAGGA 
CACCAGCTCT 
ACCCACTGCT 
GCTATGGGGT 
TCCGCGGGGT 
GGCAGGTAAT 
ACGGCCTTGG 
GCCGGCCTGA 
TTCACTGCCA 
CCAAGCCGCA 
ACAATGCCAC 
AATTCTGCAT 
GGCTGTTGCT 



60 
120 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 



60 
120 
180 
240 
300 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 




GCATGACAAG GACAGGATGA CGGTTCACAG CGGGCAGACC AACAAGTCAG CTCTTCTCTA 1620
CGACTCCCTT CAAACCGAGA GTGTCCCTTT TGAGGGCCTG CTGAGCGAAG GCAACACCAT 1680
CCGCATCGAG TTCACGTCCG ACCAGGCCCG GGCGGCCTCC ACCTTCAACA TCCGATTTGA 1740
AGCGTTTGAG AAAGGCCACT GCTATGAGCC CTACATCCAG AATGGGAACT TCACTACATC 1800
CGACCCGACC TATAACATTG GGACTATAGT GGAGTTCACC TGCGACCCCG GCCACTCCCT 1860
GGAGCAGGGC CCGGCCATCA TCGAATGCAT CAATGTGOGG GACCCATACT GGAATGACAC 1920
AGAGCCCCTG TGCAGAGCCA TGTGTGGTGG GGAGCTCTCT GCTGTGGCTG GGGTGGTATT 1980
GTCCCCAAAC TGGCCCGAGC CCTACGTGGA AGGTGAAGAT TGTATCTGGA AGATCCACGT 2040
GGGAGAAGAG AAACGGATCT TCTTAGATAT CCAGTTCCTG AATCTGAGCA ACAGTGACAT 2100
CTTGACCATC TACGATGGCG ACGAGGTCAT GCCCCACATC TTGGGGCAGT ACCTTGGGAA 2160
CAGTGGCCCC CAGAAACTGT ACTCCTCCAC GCCAGACTTA ACCATCCAGT TCCATTCGGA 2220
CCCTGCTGGC CTCATCTTTG GAAAGGGCCA GGGATTTATC ATGAACTACA TAGAGGTATC 2280
AAGGAATGAC TCCTGCTCGG ATTTACCCGA GATCCAGAAT GGCTGGAAAA CCACTTCTCA 2340
CACGGAGTTG GTGOGGGGAG CCAGAATCAC CTACCAGTGT GACCCCGGCT ATGACATCGT 2400
GGGGAGTGAC ACCCTCACCT GCCAGTGGGA CCTCAGCTGG AGCAGCGACC CCCCATTTTG 2460
TGAGAAAATT ATGTACTGCA CCGACCCCGG AGAGGTGGAT CACTCGACCC GCTTAATTTC 2520
GGATCCTGTG CTGCTGGTGG GGACCACCAT CCAATACACC TGCAACCCCG GTTTTGTGCT 2580
TGAAGGGAGT TCTCTTCTGA CCTGCTACAG CCGTGAAACA GGGACTCCCA TCTGGACGTC 2640
TCGCCTGCCC CACTGCGTTT CAGAAGCGGC AGCAGAGACG TCGCTGGAAG GGGGGAACAT 2700
GGCCCTGGCT ATCTTCATCC CGGTCCTCAT CATCTCCTTA CTGCTGGGAG GAGCCTACAT 2760
TTACATCACA AGATGTCGCT ACTATTCCAA CCTCCGCCTG CCTCTGATGT ACTCCCACCC 2820
CTACAGCCAG ATCACCGTGG AAACCGAGTT TGACAACCCC ATTTACGAGA CAGGGGGAAC 2880
CCAAAAGGTT TAGGGTTTCA TTTAAAAAGA GGTACCCTTT AAAAAGGGGC TTGTGAACTC 2940
AACCCCAATT TCCCCGAGAC ATTTATCCAA AGGCCCTGGG GGCCTTGATT TAAACCCCCA 3000
AAAGGCGGCT GTTTTTTGGT TAAACTTTTT AACAAAGGGT TACGGGTTTT TTCCCCGGAT 3060
TTTATAAATT TTAAAAGTG



Seq ID NO: 208 Protein sequence:
Protein Accession #: NP_066938 

1          11         21         31         41         51
|          |          |          |          |          |

MAQEAPQBDT SPMALMDKGE NELTGSASEE SQETTTSTII TTTVITTEQA PALCSVSPSN 60 

PEGYIDSSDY PLLPLNNFLE CTYNVTVYTG YGVELQVKSV NLSDGELLSI RGVDGPTLTV 120 

LANQTLLVEG QVIRSPTNTI SVYFRTFQDD GLGTFQIjHYQ AFMLSCNFPR RPDSGDVTVM 180 

DLHSGGVAHF HCHLGYELQG AKMLTCINAS KPHWSSQEPI CSAPCGGAVH NATIGRVLSP 240 

SYPENTNGSQ FCIWTIEAPE GQKLHLHFER LLLHDKDRMT VHSGQTNKSA LLYDSLQTES 300 

VPFEGLLSEG NTIRIEFTSD QARAASTFNI RFEAFEKGHC YEPYIQNGNF TTSDPTYNIG 360 

TIVEFTCDPG HSLEQGPAII ECINVRDPYW NDTEPLCRAM CGGELSAVAG WLSPNWPEP 420 

YVEGEDCIWK IHVGEEKRIF LDIQFLNLSN SDILTIYDGD EVMPHILGQY LGNSGPQKLY 480 

SSTPDLTIQF HSDPAGLIFG KGQGFIMNYI EVSRNDSCSD LPEIQNGWKT TSHTELVRGA 540 

RITYQCDPGY DIVGSDTLTC QWDLSWSSDP PFCEKIMYCT DPGEVDHSTR LISDPVLLVG 600 

TTIQYTCNPG FVLEGSSLLT CYSRETGTPI WTSRLPHCVS EAAAETSLEG GNMALAIFIP 660 
VLIISLLLGG AYIYITRCRY YSNLRLPLMY SHPYSQITVE TEFDNPIYET GGTQKV 

Seq ID NO: 209 DNA sequence 

Nucleic Acid Accession #: NM_001327.1

Coding sequence: 89-631 

1          11         21         31         41         51
|          |          |          |          |          |

AGCAGGGGGC GCTGTGTGTA CCGAGAATAC GAGAATACCT CGTGGGCCCT GACCTTCTCT 60 

CTGAGAGCCG GGCAGAGGCT CCGGAGCCAT GCAGGCCGAA GGCCGGGGCA CAGGGGGTTC 120 

GACGGGCGAT GCTGATGGCC CAGGAGGCCC TGGCATTCCT GATGGCCCAG GGGGCAATGC 180 

TGGCGGCCCA GGAGAGGCGG GTGCCACGGG CGGCAGAGGT CCCCGGGGCG CAGGGGCAGC 240 

AAGGGCCTCG GGGCCGGGAG GAGGCGCCCC GCGGGGTCCG CATGGCGGCG CGGCTTCAGG 300 

GCTGAATGGA TGCTGCAGAT GCGGGGCCAG GGGGCCGGAG AGCCGCCTGC TTGAGTTCTA 360 

CCTCGCCATG CCTTTCGCGA CACCCATGGA AGCAGAGCTG GCCCGCAGGA GCCTGGCCCA 420 

GGATGCCCCA CCGCTTCCCG TGCCAGGGGT GCTTCTGAAG GAGTTCACTG TGTCCGGCAA 480 

CATACTGACT ATCCGACTGA CTGCTGCAGA CCACCGCCAA CTGCAGCTCT CCATCAGCTC 540 

CTGTCTCCAG CAGCTTTCCC TGTTGATGTG GATCACGCAG TGCTTTCTGC CCGTGTTTTT 600 

GGCTCAGCCT CCCTCAGGGC AGAGGCGCTA AGCCCAGCCT GGCGCCCCTT CCTAGGTCAT 660 

GCCTCCTCCC CTAGGGAATG GTCCCAGCAC GAGTGGCCAG TTCATTGTGG GGGCCTGATT 720 
GTTTGTCGCT GGAGGAGGAC GGCTTACATG TTTGTTTCTG TAGAAAATAA AACTGAGCTA 

Seq ID NO: 210 Protein sequence: 
Protein Accession #: NP_001318.1 

1          11         21         31         41         51
|          |          |          |          |          |

MQAEGRGTGG STGDADGPGG PGIPDGPGGN AGGPGEAGAT GGRGPRGAGA ARASGPGGGA 60 
PRGPHGGAAS GLNGCCRCGA RGPESRLLEF YLAMPFATPM EAELARRSLA QDAPPLPVPG 120 
VLLKEFTVSG NILTIRLTAA DHRQLQLSIS SCLQQLSLLM WITQCFLPVF LAQPPSGQRR 



Seq ID NO: 211 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 52-459 

1          11         21         31         41         51
|          |          |          |          |          |

CCTCGTGGGC CCTGACCTTC TCTCTGAGAG CCGGGCAGAG GCTCCGGAGC CATGCAGGCC 60 

GAAGGCCAGG GCACAGGGGG TTCGACGGGC GATGCTGATG GCCCAGGAGG CCCTGGCATT 120 

CCTGATGGCC CAGGGGGCAA TGCTGGCGGC CCAGGAGAGG CGGGTGCCAC GGGCGGCAGA 180 

GGTCCCCGGG GCGCAGGGGC AGCAAGGGCC TCGGGGCCGA GAGGAGGCGC CCCGCGGGGT 240 

CCGCATGGCG GTGCCGCTTC TGCGCAGGAT GGAAGGTGCC CCTGCGGGGC CAGGAGGCCG 300 




GACAGCCGCC TGCTTCAGTT CCGACTGACT GCTGCAGACC ACCGCCAACT GCAGCTCTCC 360

ATCAGCTCCT GTCTCCAGCA GCTTTCCCTG TTGATGTGGA TCACGCAGTG CTTTCTGCCC 420

GTGTTTTTGG CTCAGGCTCC CTCAGGGCAG AGGCGCTAAG CCCAGCCTGG CGCCCCTTCC 480

TAGGTCATGC CTCCTCCCCT AGGGAATGGT CCCAGCACGA GTGGCCAGTT CATTGTGGGG 540

GCCTGATTGT TTGTCGCTGG AGGAGGACGG CTTACATGTT TGTTTCTGTA GAAAATAAAG 600
CTGAGCTA






Seq ID NO: 212 Protein sequence: 
Protein Accession #: Eos sequence 



1          11         21         31         41         51
|          |          |          |          |          |

MQAEGQGTGG STGDADGPGG PGIPDGPGGN AGGPGEAGAT GGRGPRGAGA ARASGPRGGA 
PRGPHGGAAS AQDGRCPCGA RRPDSRLLQF RLTAADHRQL QLSISSCLQQ LSLLMWITQC 
FLPVFLAQAP SGQRR 

Seq ID NO: 213 DNA sequence 
Nucleic Acid Accession #: NM_000555
Coding sequence: 416..1498



1 
I 

CTTATTTTTT 
AGCACAAAGA 
TCTGGGGGGA 
ACTCCCCCTT 
AACCTTGGGT 
TACAACTGTT 
CCCGAAGTTC 
ACTTGATTTT 
GATGAATGGG 
CTTGCAGGCA 
CCGCTACTTC 
CTTGCTGGCT 
TTACATTTAC 
GGAAAGCTAT 
CAATCCCAAC 
GGCTAGCAGC 
TACCATCATC 
GACAGCCCAC 
CGGGGTTGTC 
CTTTGGTGAT 
TGATTTTTCT 
TGGCCCAAAG 
CCGAAGCAAG 
CAAGTCTAAG 
GGACCTGTAC 
GAGGGGAGAG 
TCTGCTCAAG 
TATTTTGAAA 
CACTGATCCA 
TCCAGGGATG 
TAAATTTGCC 
ATACAGCAAT 
ACAAACACAA 
GCAAGGCAGC 
CACCAATTCT 
TGCTTAGGAT 
ACATTTCCGA 
TTTGGAATCA 
TCTAGAAGGC 
TGTTCTATTG 
GGTCTGACAC 
CATGTCTGTA 
TTTTGTCTAA 
ATGTTATTTA 
AATAGCAGAG 
CATTAATAAT 
ATATTTTAAG 
CTTTCACTTG 
TAGTTTTCAG 
GCCCATAATG 
GCTTTNACCA 
CACAGCATCC 
CTAAATGGAA 
GTGTGTGTGT 
GTAATGGATT 
GTGGTGGTGG 
ACATTTTCGG 
ACCCCAAATG 
CCATTAATAA 
GTGCCTCCCA 
GGGCATAAAG 
CACCCATAGT 
TGGAGGCTGG 
ACACTAGCTC 
GGATATATTT 



11 

t 

ATGAATGTCG 
CACTGGCTGT 
GGGGATGCAC 
CATAGTCATT 
AGCTCCTTCT 
AGTCATGTGG 
CAATTTGATA 
GGACACTTTG 
TTGCCTAGCC 
CTGAGTAATG 
AAGGGGATTG 
GACCTGACGC 
ACCATTGATG 
GTCTGTTCCT 
TGGTCTGTCA 
AACAGTGCAC 
CGCAGTGGGG 
TCTTTTGAGC 
AAAAAACTCT 
GATGATGTGT 
CTGGATGAAA 
GCATCCCCAA 
TCTCCAGCTG 
CAGTCTCCCA 
CTGCCTCTGT 
TGCTCAGAGT 
TGTCCAACAG 
AACACATTGT 
CAGTTACCAA 
CAAAATGTGC 
CCGTTTAAAT 
TAAAAAGTTT 
GTGCCCCTTT 
TCCCCAGCCT 
GGCTGTCAAT 
CCTGGTGCTG 
AGAGTTTATA 
ATAAGCAATT 
TGTCTAACAT 
AATGCCTTGT 
CTTTCAGTCT 
TTTCAGGAGC 
AAAACACATG 
GAAATTATGC 
GCCAATTCAA 
TTCAATGTGG 
CAACTCTTTT 
TCTTTAACAT 
TCTTTTGAGA 
GCAAAAACAA 
AATATAAAAA 
AAACCAAGCT 
TGAGCTTGCT 
GTGTGCATCT 
GGTGGCAACT 
GGTATCTCAA 
TTCAAGAAAA 
ATGAGGATCT 
GCCCATTTTA 
NAACATTTTG 
AATGGTGGGA 
NTCACTTTAG 
TAAAGAGCAG 
TNTGAGTATT 
TCTTTAGGAT 



21 

I 

GATAGCTGCA 
TCCCTGGAGG 
ACATTAGAGT 
GTACTGAAAT 
GTTCTCTTCA 
GCATGTGTGA 
GGAGCCACTG 
ACGAAAGAGA 
CCACTCACAG 
AGAAGAAAGC 
TGTACGCTGT 
GATCTCTGTC 
GATCCAGGAA 
CAGACAACTT 
ACGTAAAAAC 
AGGCCAGGGA 
TGAAGCCTCG 
AAGTCCTCAC 
ACACTCTGGA 
TTATTGCCTG 
ATGAATGCCG 
CACCTCAGAA 
ACTCAGCAAA 
TCTCTACGCC 
CCTTGGATGA 
CCAGAGTACA 
GGCTATTGGT 
AATATGTTGG 
TTATGAGAGA 
TAGTCCATGA 
TTGCCCAAAC 
GTGTGGGGAA 
TCTCTGGATC 
CACTCTTCAC 
GGGGAGAAAT 
GGTTAGCTAA 
AAGCACAGTG 
GATAATAGTT 
ACCACATGAT 
TAACAGCCAA 
CTTTTTATAG 
AAACTCTTCA 
AAGAAAATTT 
TGTCACTGCC 
TAGAATCAGT 
ACCAGACATT 
TATCTATAAT 
TAGAAAGGAT 
TACAGGTTTA 
CTAATTTTAA 
TTCCCTTATT 

GTGTGTGTGT 
GCAGCTGCTT 
GGGTGGCACT 
ATGCCCCTAG 
GTGAGATGAT 
CTTTTTGCCC 
CTAANCCCCT 
TAGTTAATTG 
GGCCTGATTT 
GTCTCATTTA 
GACCAGAGGA 
TCCTTGATTG 
AACCTTTGAA 



31 
I 

CCAGCTTGGT 
CTGTCCCTTT 
AGGAAAGAGG 
GCAAAGACTG 
AGGGGAATTT 
GGAAACAGAT 
TCAGTCTCTG 
TAAGACATCC 
CGCCCACTGT 
CAAGAAGGTA 
GTCCTCTGAC 
TGACAACATC 
GATCGGAAGC 
CTTTAAAAAG 
ATCTGCCAAT 
GAACAAGGAC 
GAAGGCTGTG 
TGATATCACA 
TGGAAAACAG 
TGGTCCTGAA 
AGTCATGAAG 
GACTTCAGCC 
CGGAACCTCC 
CACCAGTCCT 
CTCGGACTCG 
AATCCAAGCC 
GCTTTCAAGT 
GTTTATTTTC 
TAGATTGATA 
CCTTTCAATG 
AGTTTTCCTT 
AAAAAAAACT 
TCAAGAATGG 
TCCTGATTGA 
AAACCAACAA 
GAGAATAGAC 
AATTCCTGGT 
TGGAGTAAGG 
TACATGAACT 
CACTGAAAAC 
CAAGAAATCA 
GGCTCCTTTT 
ACCAGAAAAA 
AAACAGTAAC 
TTTTTGATAG 
CTAATTATAT 
CCTAATATTT 
TTCTCTTTAC 
TAACACTGCT 
TTGAAGGTCT 
CCTTGGTAAT 
TACTGAATGG 
GTGGTGGTGG 
CAAAATTAAG 
GCTGATGTGC 
ACAAGCTTCA 
GGTAGTACTG 
CCTCTCCTTT 
ATTTCTTTCT 
GGAAAAAGTG 
TAAAATTCAG 
GTCCATCACC 
AGAATCCAGA 
CGGTATATGT 
CCAACAATNT 



41 
I 

GGGGAAAGGG 
AAAGGAGAAT 
GCTTGGAATA 
CTTCCTAAGC 
TGTCAGGCTA 
GCCAGTTTTA 
AGGTTCCACC 
AGGAACATGC 
AGCTTCTACC 
CGTTTCTACC 
CGTTTTCGCA 
AACCTGCCTC 
ATGGATGAAC 
GTGGAGTACA 
ATGAAAGCCC 
TTTGTGCGCC 
CGTGTGCTTC 
GAAGCCATCA 
GTAACTTGTC 
AAATTTCGCT 
GGAAACCCAT 
AAGAGCCCTG 
AGCAGCCAGC 
GGCAGCCTCC 
CTTGGTGATT 
TATCATTGTA 
TTTTATTTTG 
CTGTGATTTC 
ACCATCCTTT 
GAAAGCTTAG 
TTGTAGAGGG 
CATTGGCAGA 
TGGAGGACCC 
GGCCCGGGTT 
CTTATAATTG 
AGAATTGGAA 
CAATCTCTCC 
GACTTCATAT 
GTATGGTATC 
ACTGTGAGAA 
ATATCCTTTT 
TTATAAACTG 
AAAAAAAAAG 
CTCCAGGAGA 
CTTTTTAACA 
TTTAAATGAA 
CATACTGAAG 
TAAGGACTGA 
TTTTTTTTCC 
TGCTTGCCAN 
GGTGCAAATN 
CTTGCAGTTG 
TGGGAGGGGG 
AAATACTACA 
ACTGTGTAGG 
GATGTCTGTA 
GTTTCTGGTG 
TTTTGTAAAC 
AGAAGCTCAG 
ATACTTGGAT 
GCCAGAACCC 
TTTATTTTAA 
TTTCCTTATG 
ACTACTAGAA 
TCAATAACAA 



51 
I 

TTTGATGAAT 
CTTAGTTTAT 
AAATGAAAAC 
TGGAGATGCT 
TGGATTCATT 
ATGTATTTAG 
AAAATATGGA 
GAGGCTCCCG 
GAACCAGAAC 
GCAATGGGGA 
GCTTTGACGC 
AGGGAGTGCG 
TGGAGGAAGG 
CCAAGAATGT 
CCCAGTCCTT 
CCAAGCTGGT 
TGAACAAGAA 
AACTGGAGAC 
TCCATGATTT 
ATGCTCAGGA 
CAGCCACAGC 
GTCCTATGCG 
TCTCTACCCC 
GGAAGCACAA 
CCATGTAAAG 
GTAGGGTACT 
TTGTTGTTGT 
TCCTCTGGGC 
GGGGCAGCAT 
GGGCCTGGGG 
GTGTTTAAAT 
TCCAAGAATG 
TGGAAGGACA 
TGTTGTCCAG 
TGACACCAGA 
AATACTGCAG 
ACTGAGGCAA 
ACCTGATTCC 
CATCTATCTC 
TTTGTTTTCA 
TATAAAAATT 
GTGATTTTTC 
CCGAAGAATA 
AAACAAGATG 
GTTATGCTTG 
ATGTTACAGC 
ACACAGAAAT 
TCATTTGAAA 
TGTAAACATA 
TCCTGTGTTG 
TTTGGAAAGG 
TTCCTCCACT 
TGGTGCATGT 
AGACACCCCT 
GGGGAACCCA 
GCTACCAAAA 
AAATTGAAAA 
CCATTCAAAA 
GGTTTNCTTA 
TAGGGGGTGT 
CCAATGACTC 
GTTGAGGAAG 
CTTGGGCCTC 
AATACCAAAT 
TAGTACATCT 



60 
120 



60 
120 
180
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900

TCCATCTTAC TTTTAATCGA GTATAAGGAA ATCTTTCTTT ATGGCCATTT TGGAGGGAGC 3960

AGGGGATGAG GCTTGGCATA GTCCAAAATT TAAGNCTCCA ATAATTAATT GCATTTTAAA 4020 

TTGTTTTAAA TTGGCCCACT TTCAAGGCAA TTTTTTTTGT GTGTCTGTAA CTGAQCTCCT 4080 

CCACCCCTGT CATTCACTTC CAATTTTACC CAATCCAATT TTAGCACTCA AGTTCCATTG 4140 

TGTTAATTTT TGCACGGTCT ACACACATCA AGTCAGCAAG CATTTGCCAC CACTCCCTAT 4200 

ACTTCTCCCT CTTTTTTACA CACACACACA CACACACACA CACAATCCAT CTCTTGCTTG 4260 

TTCCTACCTC CCTGATTTTT CTTCCCTACA GAAATAGAAA TAGGGACAAA GAAGGGGAAA 4320 

ATGTATATAT TGGGGCTGGG CTGAACAACT AACTTCATAA GTAGTATTAA CTAGGGGTAA 4380 

ATTGAGAGAA AAGCTCCTTT TCTCTTCACT GTTTTGGAAA GGATAGCCAT TAGCATGACT 4440 

GCTTTGTGTC CTTATGGACT TTAGTATTAG CCTAGATTGA ATTATAGCGT TTTTCTAGCT 4500 

GAAGGAACCT TAAGATCACA TCATCTACTC CTCTACTCCA AATTTCTCAT TCTTCAGGCC 4560 

AGGAAACCGA GACACAGAGG TAAAGTAATT TCCCCAAGGT CACACAGCTG GCTGGGGCAQ 4620 

GATTGGGTTT ACAACCCACA TCTCCTGGCT CTTATTCCAG GGCCTTTTCC CACTAAGTAG 4680 

TATTGCCTTC CATTAGGCTC CTGAGAGTTA TTTCTCAGGG TCATGTTGCA TCTTGGAGCC 4740 

ACATGCTGCT GCCCTGATCT CAGTGGGAAA TNCACCCAGC AACCTAATAC AGCCCCTTTT 4800 

CCCTGCATTC ACCTGGTTCC CATCCACATG GGTTGCAGAT GTCCTTGAAG AGAGTGAGGC 4860 

ATTGAGGGCC AATAGGAGCA ATGGGGTCCC TGGCCTTGTC CATCTGATTC AGGAGATCAC 4920 

TGCTCCATCG TGAGGAGCCC TCTGAATAGC CCCCCACTGA ATGCTTGCCT TGCCCAAATG 4980 

GAATGGAGGA AGATTGATTT TCTCCATCAG TTCACCTTGT GTCATCTCAT AATGGTTGGT 5040 

CTTTCCAGGC TGAGGGAAAT GTTTCTTGTT TCCANAGTAN AAAAAAGAAA GAGTGGAACA 5100 

ATANCTTTGT TCATCCTAAC TTTCTGAGAT GGCTTTTCAA CATTTAAAAA AAACTAGTGT 5160 

GGTACCATTC ACTGGCANGA TTTNTTTTAG AATATGGGAG TAAGATGAGG TAGAGAAAAT 5220 

AACCTGGTCT CACTGTGGTT GCCCTCATCC ACAATGTCCC CAAAGCCATC CTGCTNTGAT 5280 

GAGGACAATT TCCAGGTATA AGCAAGGGGC TTTGTGACAA AAATGTACCC TGGCTGATGT 5340 

TAAACATTGG CTCCTGTGTT TGCACCAAAA TAGCAAGCTG TGTGCTCTAT ACACTCTTCC 5400 

CATCGTCTTG TGTACACTGC TCCTGTGGCC TTCCACAGCA GAAACCAGGG CAAAAGGGTC 5460 

CAAACACATG GTTTTCCTTG CTGCAAGGCT NTTCCTGGGA ACTAAGGGGG TATTTATTAG 5520 

TTCAGTTNTA AGAGACCTCC TTCTGGGCTT ACCCCACTCC TCAGGTACTT CTCTCTCCTT 5580 

CCTCCTTCTC CTCCACAGTC ACAAGTAACC AAGGAACCTG AAAGTGGATG TGTAGCTATT 5640 

TGAAGAAGGC AAGGAACCCT GAGATTCTTC TTTGAATCCT TTAGTCCAAG TCTTAGACCA 5700 

GTGATTGGTG CTTACCTTGA ACAAAATTTT GTCTGTGTTC CTAATCCCTT CAATACTNTG 5760 

GGTACAATGC TCCCAATCAC CCTGCACATT TGATTCTAAA TGGCTTTTAT TTTTTAAAAA 5820 

TCCATATCCC TAGGACAAGA NAACAGGATG CCTATATCCC CAAAATGAGC TCCAGGACAC 5880 

TGATGGGAAT GATCCCAANG ATCACCCCAC CTCAGAAAAC GTCTGTGCCA ANAGACTTCC 5940 

CCAGATAGAA NCACTGGGAC AGTGGTTTGA ACGACTTCTT TTATGGTTGT CCAGTTTGCT 6000 

ATGGAAATAA AAGGCATTGA TTTTTTAAAA AAGATGATTG GAACCTGTCT TTGGCCACAT 6060 

AGGGCCACTT GGATCCATTT CCAGGCCTTA CTCATATATT GCCTTCACTG AAGGGCTTTG 6120 

GCTTTAAGTC CCAGACTGGT CTCCCAAGTG AACCATAAGT GTTTTGGAGC TCATCTGGGG 6180 

TGAGGCATGA GAATGTTGCC CCATCTATCC CTTCAGGAAA AGGTGCCTTC CCTCCCTTTC 6240 

TCCTAAAGCC TGGTCCCCAA AAATTGTTTT TGTCTCCAAA AGTCTAGTAT GGTCTTTATA 6300 

CACCCANACT CTTAGTGTTG CGTCCTGCCT TGTTTCCTTG TTAAGGATCT ATGCANACCT 6360 

CCCGCTTTGG CTTAGCTAGC GTGACATTGG CTATCATTTG ACAAGACTAA CTTTTTTTTT 6420 

TTTTTTTTTG ACTGAGTCTC CCTCTGTCAC CTAGGCTGGA GTGCAGTGGC ACAATCTTGG 6480 

CTCGCTGCAA CCTTCACCCT TCACCTCCCA GGTCGAAGCG ATTCTCCTGC CTCAGTCTCC 6540 

CGAGTAGCTG GGATTACAGG CGTGCGCCAC CAAATCTGGC TATTTTTTTA TTATTATTAT 6600 

TTTTAGTAGA GATGGGGTTT CACCATGTTG GCCAGACTGG TCTTGAACTC TTGGCCTCAA 6660 

ATTATCTGCC CACCTCGGCC TCCCAAAGTG CTGGGATTAC AGGCATGAGC ACCATGCCCA 6720 

GCTGACAAGA CTAATTTTTT ATCCCTTGGT TTATTGGCTT CAACATCTTC TGGAATCAGA 6780 

GGTGATTTTT TCTTACCTTG GATGCCTGAG ACTAGGGGAG TATAGAATTC CAATTGGTAA 6840 

TTAAGGCATC TTTCTGCTCC TGATCAGAAG GGCAGGTTAG TTGGGAGAGG TCAGATGGCA 6900 

CAACAGAAGT CACCTTGTAA GTAAGGCAAA GACTTTGAAG GCATTAGCGT TTCTCATTAC 6960 

TTAGGTCAAT AACCTTGAGG GAATCAATGG CTTTTTTGCC GCTCTACCTC TTTGTGTATC 7020 

TCTTTGACTT TTCTTTCTCT GTCTAGTTTC CTCTGTTCTC AGTTTATATT CTATGTTATC 7080 

AGTCTCTCTT TCCACAGTAC AAACATCCAT CCTTTCTCCT GTGCAATTCT GTCTCTCCCT 7140 

CTTATTATCT TTATTTGTAC TTTTTCCTTC CTCCCTGTCT AGGCATTGGG CATGTGCCTC 7200 

TTCTTAGCCT GTGATTTTGC CTTGGGACTG ATGATAAATT ATTTCCAGAT TCAATCAGCC 7260 

CTGGTCCTAC CCCAGTCCAA TCAGAAGTAT GTTGGTGGGG AATCAACCTG ATCCTGGCCC 7320 

TTTCTTCTTC TCCATTTTCA TTCGTAATCC CCCTCAGCAG ATCTTTACAA GCAGTTTCCT 7380 

TATAGCTCAT GTATCTTTAG GTCTTTGCCT TCCAAGCACT GTACAGAATA CTTTGTGGTT 7440 

CCTTTTTAGT CTGACATTTT GTGGAGCAGT GAAGCGTGCT CAGAGACATA ATCAGCTGAA 7500 

GAGAAAAAAT CCACCCATGG ATTTATATCA GCTAAATACT AATAATTGAT TTTGTTTGAT 7560 

GTGCCCATAA TTTTTAAAGC TGCAATATAA TATAATGAGG GACCACAGGT AATTTCTCCT 7620 

GTCATTTGTT TTGGCTGGAT GGGGGTGGGG GAGTAATTGC TTAAAGTTTT ACCATTACAC 7680 

ATTAAACTCT CTATAATAAT CTTGTTTGGG GCTTGCTAAC TGTTGAGCTG TTTTAACTAA 7740 

ACTGGTAGGC AATCGGAGTT GATTTAAATG AAAAGATAAT TTAACAAATC TATACTATAA 7800 

AAAGAGACAT TTGCTTAATT GACATGTATT TTTTCCTTCT GAGTCACCTA AACATTTACT 7860 

CTTGACACCA ACTGTTCATG ATACTGAATA GACAGTCCAT ATAAGAGAAA TTAGTGGACC 7920 

TAAAGAAGCC AGATTGTAGG TGTTAATTTA TTAAACAGAA TTGCAAAGCC CTTGGAAATG 7980 

TCACTGCTTG GCAATACCAT ATGGCATGCC AAAATTTACA ATGACTTTTC TTTATAAGTT 8040 

ATCCAAAAGG GATTTGAACA AGTAAGAGGT TATGCCAAAA TGTCTCCAAT GTATGGTCCT 8100 

GTAATATATT GCAGCTTGAA GCCAATGATC CCTTATGACT TGTATACAAC TAATGCATGT 8160 

TTTATTGAAT TTTGCATTTC CCACGTGTGG TAAGTCTTTA AAATGTTTTT GATCACCTTT 8220 

NTGTGCCATT AAACTTGTAC AGAAAATGTT TTTATGGCCA TTTTCAAAGG GAGAAAGTTT 8280 

AAAATGGAAA CAGCCCACCC TTTCTGCCCT ATAGCTGTAG TTAGAATTGA GTACCTGTAG 8340 

CAAAACAGCT GTAATTGGTG GTTGTAGTGT TAGAGGTGTT AGCTTGCTAG TGACTAGCTT 8400 

TGGAGAGTAA ATGCATGGTA TTGTACATCA CATTTCTTAA CTCGTTTTAA CCTCTGAAAA 8460 

GAATATATTC TTCTTTGTAG TCCTTCTTCC CACCCCCTTG CCCTCTCCCT CTCCCTGCTC 8520 

CCAGTTGTCT TACAGTTGTA AATATCTGAT TTGAGGCCCA ATAACTCTTG CCAAGTAAAG 8580 

TCAGCAAACA ACAAACAAAC CAAAATGTGG GGAAAAGGCA TTTCTCAACC ATCTCTCAGC 8640 

AGTTATTGAT CATTTCTTAA GGAACAGCAT TGTGATCAAA GACTCAACTT TACGTAAAAA 8700 

rCAGTGGTAA ATTGGGGTTG TATTGGCCAT TGATTACATT CAGGATTGAA TAGTTTTCAG 8760 

AATCACATGT AATCCAAAGA CAGTAGGTAG TGATGTCCCT TATCCCTGCA GCTGTTTTAA 8820 

GATAGAGACC TCAGAAGACT CTGCTTGACC GATGACCAAT AATTATTTGA AAAAAAAAGA 8880 

AAAAATGAGA GAAATAAAAC AGATATTTAA GAACTTTAGC CACCTATTTA GAATAGTTAT 8940 

AGCCAGAAAA AAAAACAAGG GCATGAGTTC AAATGCATTA CTATCAGTGT CCTAGGCAAT 9000 

ACCTAACCTA CTCTGAAATT GTGATTCAAA AGCAGTATTT CAAGAGGCAT TCTCCTTTTT 9060 

TGGTTTGCTG ACCCCACTTG GACTGGTAGG TTTGGTGAGG CCCCCATAAA CCAGCTGGAG 9120 
CAGACCCTTT TCATCTCCTG TGCCTGTAAC ACCCCTCTTC CCCCACCCCC TCCGCAATTC 9180 

AATGAGGGCT TTCTTGGGTC AGAGGACTTC AAGGTTGTCT AGAGAAGTTT GCCATGTGTG 9240 

TAAGGTGCTG TGAACTGTGA GTGCTGAAGA TTCGCAGCAT TCAATACCAG GCAGCCAAAG 9300 

AGCTGCTCTT GCAATTATTT TGGCTCTCAA GCTCTGTTCT TCATCGCATT CTCATTTCTG 9360 
TGTACATTTG CAAGATGTGT GTAATGTCAT TTTCCAAAAA TAAAATTTGA TTTCAAT 



Seq ID NO: 214 Protein sequence:
Protein Accession #: NP_000546

1 11 21 31 41 51

I I I I I I

MELDFGHFDE RDKTSRNMRG SRMNGLPSPT HSAHCSFYRT RTLQALSNEK KAKKVRFYRN 60
GDRYFKGIVY AVSSDRFRSF DALLADLTRS LSDNINLPQG VRYIYTIDGS RKIGSMDELE 120
EGESYVCSSD NFFKKVEYTK NVNPNWSVNV KTSANMKAPQ SLASSNSAQA RENKDFVRPK 180
LVTIIRSGVK PRKAVRVLLN KKTAHSFEQV LTDITEAIKL ETGVVKKLYT LDGKQVTCLH 240
DFFGDDDVFI ACGPEKFRYA QDDFSLDENE CRVMKGNPSA TAGPKASPTP QKTSAKSPGP 300
MRRSKSPADS ANGTSSSQLS TPKSKQSPIS TPTSPGSLRK HKDLYLPLSL DDSDSLGDSM

Seq ID NO: 215 DNA sequence
Nucleic Acid Accession #: NM_130467
Coding sequence: 312..644

1 11 21 31 41 51

I I I I I I

GGCACGAGGC AGAGCTCTGC AAGGAGAGGT TGTGTCTTCG TTCTTTCCGC CATCTTCGTT 60

CTTTCCAACA TCTTCGTTCT TTCTCACTGA CCGAGACTCA GCCGGTAGGT CTGCAGAGTG 120

GTCTTCCTGG TAATTTAGTT GTGAGTGAAT GTGTGGAGGA GCCAGCGGGC TTAGGACAGG 180

TCCTGTGGCA CAGTCCGTGG CTTTGAGGGA AAAGGGCCTC GCGGTGGTCC TCCGCCTTCC 240

CCCAGGTCGT GATGCAGGCG CCATGGGCCG GTAATCGTGG CTGGGCTGGA ACGAGGGAGG 300

AAGTGAGAGA TATGAGTGAG CATGTAACAA GATCCCAATC CTCAGAAAGA GGAAATGACC 360

AAGAGTCTTC CCAGCCAGTT GGACCTGTGA TTGTCCAGCA GCCCACTGAG GAAAAACGTC 420

AAGAAGAGGA ACCACCAACT GATAATCAGG GTATTGCACC TAGTGGGGAG ATCAAAAATG 480

AAGGAGCACC TGCTGTTCAA GGGACTGATG TGGAAGCTTT TCAACAGGAA CTGGCTCTGC 540

TTAAGATAGA GGATGCACCT GGAGATGGTC CTGATGTCAG GGAGGGGACT CTGCCCACTT 600

TTGATCCCAC TAAAGTGCTG GAAGCAGGTG AAGGGCAACT ATAGGTTTAA ACCAAGACAA 660

ATGAAGACTG AAACCAAGAA TATTGTTCTT ATGCTGGAAA TTTGACTGCT AACATTCTCT 720
TAATAAAGTT TTACAGTTTT CTGCAAAAAA AAAAAAAAAA AAA



Seq ID NO: 216 Protein sequence:
Protein Accession #: NP_569734

1 11 21 31 41 51

I I I I I I

MSEHVTRSQS SERGNDQESS QPVGPVIVQQ PTEEKRQEEE PPTDNQGIAP SGEIKNEGAP 60
AVQGTDVEAF QQELALLKIE DAPGDGPDVR EGTLPTFDPT KVLEAGEGQL

Seq ID NO: 217 DNA sequence
Nucleic Acid Accession #: NM_001476.1
Coding sequence: 82..435

1 11 21 31 41 51

I I I I I I

GCCAGGGAGC TGTGAGGCAG TGCTGTGTGG TTCCTGCCGT CCGGACTCTT TTTCCTCTAC 60

TGAGATTCAT CTGTGTGAAA TATGAGTTGG CGAGGAAGAT CGACCTATTA TTGGCCTAGA 120

CCAAGGCGCT ATGTACAGCC TCCTGAAGTG ATTGGGCCTA TGCGGCCCGA GCAGTTCAGT 180

GATGAAGTGG AACCAGCAAC ACCTGAAGAA GGGGAACCAG CAACTCAACG TCAGGATCCT 240

GCAGCTGCTC AGGAGGGAGA GGATGAGGGA GCATCTGCAG GTCAAGGGCC GAAGCCTGAA 300

GCTGATAGCC AGGAACAGGG TCACCCACAG ACTGGGTGTG AGTGTGAAGA TGGTCCTGAT 360

GGGCAGGAGG TGGACCCGCC AAATCCAGAG GAGGTGAAAA CGCCTGAAGA AGGTGAAAAG 420

CAATCACAGT GTTAAAAGAA GACACGTTGA AATGATGCAG GCTGCTCCTA TGTTGGAAAT 480
TTGTTCATTA AAATTCTCCC AATAAAGCTT TACAGCCTTC TGCAAAA

Seq ID NO: 218 Protein sequence:
Protein Accession #: NP_001467.1

1 11 21 31 41 51

I I I I I I

MSWRGRSTYY WPRPRRYVQP PEVIGPMRPE QFSDEVEPAT PEEGEPATQR QDPAAAQEGE 60
DEGASAGQGP KPEADSQEQG HPQTGCECED GPDGQEVDPP NPEEVKTPEE GEKQSQC

Seq ID NO: 219 DNA sequence
Nucleic Acid Accession #: NM_001476
Coding sequence: 90-3671

1 11 21 31 41 51

I I I I I I

ACAGCGGAGC GCAGAGTGAG AACCACCAAC CGAGGCGCCG GGCAGCGACC CCTGCAGCGG 60

AGACAGAGAC TGAGCGGCCC GGCACCGCCA TGCCTGCGCT CTGGCTGGGC TGCTGCCTCT 120

GCTTCTCGCT CCTCCTGCCC GCAGCCCGGG CCACCTCCAG GAGGGAAGTC TGTGATTGCA 180

ATGGGAAGTC CAGGCAGTGT ATCTTTGATC GGGAACTTCA CAGACAAACT GGTAATGGAT 240

TCCGCTGCCT CAACTGCAAT GACAACACTG ATGGCATTCA CTGCGAGAAG TGCAAGAATG 300

GCTTTTACCG GCACAGAGAA AGGGACCGCT GTTTGCCCTG CAATTGTAAC TCCAAAGGTT 360

CTCTTAGTGC TCGATGTGAC AACTCTGGAC GGTGCAGCTG TAAACCAGGT GTGACAGGAG 420 

CCAGATGCGA CCGATGTCTG CCAGGCTTCC ACATGCTCAC GGATGCGGGG TGCACCCAAG 480 

ACCAGAGACT GCTAGACTCC AAGTGTGACT GTGACCCAGC TGGCATCGCA GGGCCCTGTG 540 

ACGCGGGCCG CTGTGTCTGC AAGCCAGCTG TTACTGGAGA ACGCTGTGAT AGGTGTCGAT 600 

CAGGTTACTA TAATCTGGAT GGGGGGAACC CTGAGGGCTG TACCCAGTGT TTCTGCTATG 660 

GGCATTCAGC CAGCTGCCGC AGCTCTGCAG AATACAGTGT CCATAAGATC ACCTCTACCT 720 

TTCATCAAGA TGTTGATGGC TGGAAGGCTG TCCAACGAAA TGGGTCTCCT GCAAAGCTCC 780 

AATGGTCACA GCGCCATCAA GATGTGTTTA GCTCAGCCCA ACGACTAGAC CCTGTCTATT 840 

TTGTGGCTCC TGCCAAATTT CTTGGGAATC AACAGGTGAG CTATGGGCAA AGCCTGTCCT 900 

TTGACTACCG TGTGGACAGA GGAGGCAGAC ACCCATCTGC CCATGATGTG ATTCTGGAAG 960 

GTGCTGGTCT ACGGATCACA GCTCCCTTGA TGCCACTTGG CAAGACACTG CCTTGTGGGC 1020 

TCACCAAGAC TTACACATTC AGGTTAAATG AGCATCCAAG CAATAATTGG AGCCCCCAGC 1080 

TGAGTTACTT TGAGTATCGA AGGTTACTGC GGAATCTCAC AGCCCTCCGC ATCCGAGCTA 1140 

CATATGGAGA ATACAGTACT GGGTACATTG ACAATGTGAC CCTGATTTCA GCCCGCCCTG 1200 

TCTCTGGAGC CCCAGCACCC TGGGTTGAAC AGTGTATATG TCCTGTTGGG TACAAGGGGC 1260 

AATTCTGCCA GGATTGTGCT TCTGGCTACA AGAGAGATTC AGCGAGACTG GGGCCTTTTG 1320 

GCACCTGTAT TCCTTGTAAC TGTCAAGGGG GAGGGGCCTG TGATCCAGAC ACAGGAGATT 1380 

GTTATTCAGG GGATGAGAAT CCTGACATTG AGTGTGCTGA CTGCCCAATT GGTTTCTACA 1440 

ACGATCCGCA CGACCCCCGC AGCTGCAAGC CATGTCCCTG TCATAACGGG TTCAGCTGCT 1500 

CAGTGATGCC GGAGACGGAG GAGGTGGTGT GCAATAACTG CCCTCCCGGG GTCACCGGTG 1560 

CCCGCTGTGA GCTCTGTGCT GATGGCTACT TTGGGGACCC CTTTGGTGAA CATGGCCCAG 1620 

TGAGGCCTTG TCAGCCCTGT CAATGCAACA ACAATGTGGA CCCCAGTGCC TCTGGGAATT 1680 

GTGACCGGCT GACAGGCAGG TGTTTGAAGT GTATCCACAA CACAGCCGGC ATCTACTGCG 1740 

ACCAGTGCAA AGCAGGCTAC TTCGGGGACC CATTGGCTCC CAACCCAGCA GACAAGTGTC 1800 

GAGCTTGCAA CTGTAACCCC ATGGGCTCAG AGCCTGTAGG ATGTCGAAGT GATGGCACCT 1860 

GTGTTTGCAA GCCAGGATTT GGTGGCCCCA ACTGTGAGCA TGGAGCATTC AGCTGTCCAG 1920 

CTTGCTATAA TCAAGTGAAG ATTCAGATGG ATCAGTTTAT GCAGCAGCTT CAGAGAATGG 1980 

AGGCCCTGAT TTCAAAGGCT CAGGGTGGTG ATGGAGTAGT ACCTGATACA GAGCTGGAAG 2040 

GCAGGATGCA GCAGGCTGAG CAGGCCCTTC AGGACATTCT GAGAGATGCC CAGATTTCAG 2100 

AAGGTGCTAG CAGATCCCTT GGTCTCCAGT TGGCCAAGGT GAGGAGCCAA GAGAACAGCT 2160 

ACCAGAGCCG CCTGGATGAC CTCAAGATGA CTGTGGAAAG AGTTCGGGCT CTGGGAAGTC 2220 

AGTACCAGAA CCGAGTTCGG GATACTCACA GGCTCATCAC TCAGATGCAG CTGAGCCTGG 2280 

CAGAAAGTGA AGCTTCCTTG GGAAACACTA ACATTCCTGC CTCAGACCAC TACGTGGGGC 2340 

CAAATGGCTT TAAAAGTCTG GCTCAGGAGG CCACAAGATT AGCAGAAAGC CACGTTGAGT 2400 

CAGCCAGTAA CATGGAGCAA CTGACAAGGG AAACTGAGGA CTATTCCAAA CAAGCCCTCT 2460 

CACTGGTGCG CAAGGCCCTG CATGAAGGAG TCGGAAGCGG AAGCGGTAGC CCGGACGGTG 2520 

CTGTGGTGCA AGGGCTTGTG GAAAAATTGG AGAAAACCAA GTCCCTGGCC CAGCAGTTGA 2580 

CAAGGGAGGC CACTCAAGCG GAAATTGAAG CAGATAGGTC TTATCAGCAC AGTCTCCGCC 2640 

TCCTGGATTC AGTGTCTCGG CTTCAGGGAG TCAGTGATCA GTCCTTTCAG GTGGAAGAAG 2700 

CAAAGAGGAT CAAACAAAAA GCGGATTCAC TCTCAACGCT GGTAACCAGG CATATGGATG 2760 

AGTTCAAGCG TACACAAAAG AATCTGGGAA ACTGGAAAGA AGAAGCACAG CAGCTCTTAC 2820 

AGAATGGAAA AAGTGGGAGA GAGAAATCAG ATCAGCTGCT TTCCCGTGCC AATCTTGCTA 2880 

AAAGCAGAGC ACAAGAAGCA CTGAGTATGG GCAATGCCAC TTTTTATGAA GTTGAGAGCA 2940 

TCCTTAAAAA CCTCAGAGAG TTTGACCTGC AGGTGGACAA CAGAAAAGCA GAAGCTGAAG 3000 

AAGCCATGAA GAGACTCTCC TACATCAGCC AGAAGGTTTC AGATGCCAGT GACAAGACCC 3060 

AGCAAGCAGA AAGAGCCCTG GGGAGCGCTG CTGCTGATGC ACAGAGGGCA AAGAATGGGG 3120 

CCGGGGAGGC CCTGGAAATC TCCAGTGAGA TTGAACAGGA GATTGGGAGT CTGAACTTGG 3180 

AAGCCAATGT GACAGCAGAT GGAGCCTTGG CCATGGAAAA GGGACTGGCC TCTCTGAAGA 3240 

GTGAGATGAG GGAAGTGGAA GGAGAGCTGG AAAGGAAGGA GCTGGAGTTT GACACGAATA 3300 

TGGATGCAGT ACAQATGGTG ATTACAGAAG CCCAGAAGGT TGATACCAGA GCCAAGAACG 3360 

CTGGGGTTAC AATCCAAGAC ACACTCAACA CATTAGACGG CCTCCTGCAT CTGATGGACC 3420 

AGCCTCTCAG TGTAGATGAA GAGGGGCTGG TCTTACTGGA GCAGAAGCTT TCCCGAGCCA 3480 

AGACCCAGAT CAACAGCCAA CTGCGGCCCA TGATGTCAGA GCTGGAAGAG AGGGCACGTC 3540 

AGCAGAGGGG CCACCTCCAT TTGCTGGAGA CAAGCATAGA TGGGATTCTG GCTGATGTGA 3600 

AGAACTTGGA GAACATTAGG GACAACCTGC CCCCAGGCTG CTACAATACC CAGGCTCTTG 3660 

AGCAACAGTG AAGCTGCCAT AAATATTTCT CAACTGAGGT TCTTGGGATA CAGATCTCAG 3720 

GGCTCGGGAG CCATGTCATG TGAGTGGGTG GGATGGGGAC ATTTGAACAT GTTTAATGGG 3780 

TATGCTCAGG TCAACTGACC TGACCCCATT CCTGATCCCA TGGCCAGGTG GTTGTCTTAT 3840 

TGCACCATAC TCCTTGCTTC CTGATGCTGG GCAATGAGGC AGATAGCACT GGGTGTGAGA 3900 

ATGATCAAGG ATCTGGACCC CAAAGAATAG ACTGGATGGA AAGACAAACT GCACAGGCAG 3960 

ATGTTTGCCT CATAATAGTC GTAAGTGGAG TCCTGGAATT TGGACAAGTG CTGTTGGGAT 4020 

ATAGTCAACT TATTCTTTGA GTAATGTGAC TAAAGGAAAA AACTTTGACT TTGCCCAGGC 4080 

ATGAAATTCT TCCTAATGTC AGAACAGAGT GCAACCCAGT CACACTGTGG CCAGTAAAAT 4140 

ACTATTGCCT CATATTGTCC TCTGCAAGCT TCTTGCTGAT CAGAGTTCCT CCTACTTACA 4200 

ACCCAGGGTG TGAACATGTT CTCCATTTTC AAGCTGGAAG AAGTGAGCAG TGTTGGAGTG 4260 

AGGACCTGTA AGGCAGGCCC ATTCAGAGCT ATGGTGCTTG CTGGTGCCTG CCACCTTCAA 4320 

GTTCTGGACC TGGGCATGAC ATCCTTTCTT TTAATGATGC CATGGCAACT TAGAGATTGC 4380 

ATTTTTATTA AAGCATTTCC TACCAGCAAA GCAAATGTTG GGAAAGTATT TACTTTTTCG 4440 

GTTTCAAAGT GATAGAAAAG TGTGGCTTGG GCATTGAAAG AGGTAAAATT CTCTAGATTT 4500 

ATTAGTCCTA ATTCAATCCT ACTTTTCGAA CACCAAAAAT GATGCGCATC AATGTATTTT 4560 

ATCTTATTTT CTCAATCTCC TCTCTCTTTC CTCCACCCAT AATAAGAGAA TGTTCCTACT 4620 

CACACTTCAG CTGGGTCACA TCCATCCCTC CATTCATCCT TCCATCCATC TTTCCATCCA 4680 

TTACCTCCAT CCATCCTTCC AACATATATT TATTGAGTAC CTACTGTGTG CCAGGGGCTG 4740 

GTGGGACAGT GGTGACATAG TCTCTGCCCT CATAGAGTTG ATTGTCTAGT GAGGAAGACA 4800 

AGCATTTTTA AAAAATAAAT TTAAACTTAC AAACTTTGTT TGTCACAAGT GGTGTTTATT 4860 

GCAATAACCG CTTGGTTTGC AACCTCTTTG CTCAACAGAA CATATGTTGC AAGACCCTCC 4920 

CATGGGGGCA CTTGAGTTTT GGCAAGGCTG ACAGAGCTCT GGGTTGTGCA CATTTCTTTG 4980 

CATTCCAGCT GTCACTCTGT GCCTTTCTAC AACTGATTGC AACAGACTGT TGAGTTATGA 5040 

TAACACCAGT GGGAATTGCT GGAGGAACCA GAGGCACTTC CACCTTGGCT GGGAAGACTA 5100 

TGGTGCTGCC TTGCTTCTGT ATTTCCTTGG ATTTTCCTGA AAGTGTTTTT AAATAAAGAA 5160 
CAATTGTTAG ATGCC 

Seq ID NO: 220 Protein sequence:
Protein Accession #: NP_005553

1 11 21 31 41 51

I I I I I I

MPALWLGCCL CFSLLLPAAR ATSRREVCDC NGKSRQCIFD RELHRQTGNG FRCLNCNDNT 60 

DGIHCEKCKN GFYRHRERDR CLPCNCNSKG SLSARCDNSG RCSCKPGVTG ARCDRCLPGF 120 

HMLTDAGCTQ DQRLLDSKCD CDPAGIAGPC DAGRCVCKPA VTGERCDRCR SGYYNLDGGN 180 

PEGCTQCFCY GHSASCRSSA EYSVHKITST FHQDVDGWKA VQRNGSPAKL QWSQRHQDVF 240 

SSAQRLDPVY FVAPAKFLGN QQVSYGQSLS FDYRVDRGGR HPSAHDVILE GAGLRITAPL 300 

MPLGKTLPCG LTKTYTFRLN EHPSNNWSPQ LSYFEYRRLL RNLTALRIRA TYGEYSTGYI 360 

DNVTLISARP VSGAPAPWVE QCICPVGYKG QFCQDCASGY KRDSARLGPF GTCIPCNCQG 420 

GGACDPDTGD CYSGDENPDI ECADCPIGFY NDPHDPRSCK PCPCHNGFSC SVMPETEEVV 480 

CNNCPPGVTG ARCELCADGY FGDPFGEHGP VRPCQPCQCN NNVDPSASGN CDRLTGRCLK 540 

CIHNTAGIYC DQCKAGYFGD PLAPNPADKC RACNCNPMGS EPVGCRSDGT CVCKPGFGGP 600 

NCEHGAFSCP ACYNQVKIQM DQFMQQLQRM EALISKAQGG DGVVPDTELE GRMQQAEQAL 660 

QDILRDAQIS EGASRSLGLQ LAKVRSQENS YQSRLDDLKM TVERVRALGS QYQNRVRDTH 720 

RLITQMQLSL AESEASLGNT NIPASDHYVG PNGFKSLAQE ATRLAESHVE SASNMEQLTR 780 

ETEDYSKQAL SLVRKALHEG VGSGSGSPDG AVVQGLVEKL EKTKSLAQQL TREATQAEIE 840 

ADRSYQHSLR LLDSVSRLQG VSDQSFQVEE AKRIKQKADS LSTLVTRHMD EFKRTQKNLG 900 

NWKEEAQQLL QNGKSGREKS DQLLSRANLA KSRAQEALSM GNATFYEVES ILKNLREFDL 960 

QVDNRKAEAE EAMKRLSYIS QKVSDASDKT QQAERALGSA AADAQRAKNG AGEALEISSE 1020 

IEQEIGSLNL EANVTADGAL AMEKGLASLK SEMREVEGEL ERKELEFDTN MDAVQMVITE 1080 

AQKVDTRAKN AGVTIQDTLN TLDGLLHLMD QPLSVDEEGL VLLEQKLSRA KTQINSQLRP 1140 
MMSELEERAR QQRGHLHLLE TSIDGILADV KNLENIRDNL PPGCYNTQAL EQQ 

Seq ID NO: 221 DNA sequence
Nucleic Acid Accession #: NM_016529
Coding sequence: 13-1854

1 11 21 31 41 51

I I I I I I

GTCAAGAAAA GAATGTCTGT AATTGTTCGA ACTCCTTCAG GACGACTTCG GCTTTACTGT 60 

AAAGGGGCTG ATAATGTGAT TTTTGAGAGA CTTTCAAAAG ACTCAAAATA TATGGAGGAA 120 

ACATTATGCC ATCTGGAATA CTTTGCCACG GAAGGCTTGC GGACTCTCTG TGTGGCTTAT 180 

GCTGATCTCT CTGAGAATGA GTATGAGGAG TGGCTGAAAG TCTATCAGGA AGCCAGCACC 240 

ATATTGAAGG ACAGAGCTCA ACGGTTGGAA GAGTGTTACG AGATCATTGA GAAGAATTTG 300 

CTGCTACTTG GAGCCACAGC CATAGAAGAT CGCCTTCAAG CAGGAGTTCC AGAAACCATC 360 

GCAACACTGT TGAAGGCAGA AATTAAAATA TGGGTGTTGA CAGGAGACAA ACAAGAAACT 420 

GCGATTAATA TAGGGTATTC CTGCCGATTG GTATCGCAGA ATATGGCCCT TATCCTATTG 480 

AAGGAGGACT CTTTGGATGC CACAAGGGCA GCCATTACTC AGCACTGCAC TGACCTTGGG 540 

AATTTGCTGG GCAAGGAAAA TGACGTGGCC CTCATCATCG ATGGCCACAC CCTGAAGTAC 600 

GCGCTCTCCT TCGAAGTCCG GAGGAGTTTC CTGGATTTGG CACTCTCGTG CAAAGCGGTC 660 

ATATGCTGCA GAGTGTCTCC TCTGCAGAAG TCTGAGATAG TGGATGTGGT GAAGAAGCGG 720 

GTGAAGGCCA TCACCCTCGC CATCGGAGAC GGCGCCAACG ATGTCGGGAT GATCCAGACA 780 

GCCCACGTGG GTGTGGGAAT CAGTGGGAAT GAAGGCATGC AGGCCACCAA CAACTCGGAT 840 

TACGCCATCG CACAGTTTTC CTACTTAGAG AAGCTTCTGT TGGTTCATGG AGCCTGGAGC 900 

TACAACCGGG TGACCAAGTG CATCTTGTAC TGCTTCTATA AGAACGTGGT CCTGTATATT 960 

ATTGAGCTTT GGTTCGCCTT TGTTAATGGA TTTTCTGGGC AGATTTTATT TGAACGTTGG 1020 

TGCATCGGCC TGTACAATGT GATTTTCACC GCTTTGCCGC CCTTCACTCT GGGAATCTTT 1080 

GAGAGGTCTT GCACTCAGGA GAGCATGCTC AGGTTTCCCC AGCTCTACAA AATCACCCAG 1140 

AATGGCGAAG GCTTCAACAC AAAGGTTTTC TGGGGTCACT GCATCAACGC CTTGGTCCAC 1200 

TCCCTCATCC TCTTCTGGTT TCCCATGAAA GCTCTGGAGC ATGATACTGT GTTTGACAGT 1260 

GGTCATGCTA CCGACTATTT ATTTGTTGGA AATATTGTTT ACACATATGT TGTTGTTACT 1320 

GTTTGTCTGA AAGCTGGTTT GGAGACCACA GCTTGGACTA AATTCAGTCA TCTGGCTGTC 1380 

TGGGGAAGCA TGCTGACCTG GCTGGTGTTT TTTGGCATCT ACTCGACCAT CTGGCCCACC 1440 

ATTCCCATTG CTCCAGATAT GAGAGGACAG GCAACTATGG TCCTGAGCTC CGCACACTTC 1500 

TGGTTGGGAT TATTTCTGGT TCCTACTGCC TGTTTGATTG AAGATGTGGC ATGGAGAGCA 1560 

GCCAAGCACA CCTGCAAAAA GACATTGCTG GAGGAGGTGC AGGAGCTGGA AACCAAGTCT 1620 

CGAGTCCTGG GAAAAGCGGT GCTGCGGGAT AGCAATGGAA AGAGGCTGAA CGAGCGCGAC 1680 

CGCCTGATCA AGAGGCTGGG CCGGAAGACG CCCCCGACGC TGTTCCGGGG CAGCTCCCTG 1740 

CAGCAGGGCG TCCCGCATGG GTATGCTTTT TCTCAAGAAG AACACGGAGC TGTTAGTCAG 1800 

GAAGAAGTCA TCCGTGCTTA TGACACCACC AAAAAGAAAT CCAGGAAGAA ATAAGACATG 1860 

AATTTTCCTG ACTGATCTTA GGAAAGAGAT TCAGTTTGTT GCACCCAGTG TTAACACATC 1920 

TTTGTCAGAG AAGACTGGCG TCCAAGGCCA AAACACCAGG AAACACATTT CTGTGGCCTT 1980 

AGTTAAGCAG TTTGTTAGTT ACATATTCCC TCGCAAACCT GGAGTGCAGA CCACAGGGGA 2040 

AGCTATCTTT GCCCTCCCAA CTCGTCTGCA GTGCTTAGCC TAACTTTTGT TTATGTCGTT 2100 

ATGAAGCATT CAACTGTGCT CTGTGAGGTC TCAAATTAAA AACATTATGT TTCACCAATA 2160 
AGAAAAAAAA AAAAAAA 



Seq ID NO: 222 Protein sequence: 
Protein Accession #: NP_057613 

1 11 21 31 41 51 

I I I I I I 

MSVIVRTPSG RLRLYCKGAD NVIFERLSKD SKYMEETLCH LEYFATEGLR TLCVAYADLS 60 

ENEYEEWLKV YQEASTILKD RAQRLEECYE IIEKNLLLLG ATAIEDRLQA GVPETIATLL 120 

KAEIKIWVLT GDKQETAINI GYSCRLVSQN MALILLKEDS LDATRAAITQ HCTDLGNLLG 180 

KENDVALIID GHTLKYALSF EVRRSFLDLA LSCKAVICCR VSPLQKSEIV DVVKKRVKAI 240 

TLAIGDGAND VGMIQTAHVG VGISGNEGMQ ATNNSDYAIA QFSYLEKLLL VHGAWSYNRV 300 

TKCILYCFYK NVVLYIIELW FAFVNGFSGQ ILFERWCIGL YNVIFTALPP FTLGIFERSC 360 

TQESMLRFPQ LYKITQNGEG FNTKVFWGHC INALVHSLIL FWFPMKALEH DTVFDSGHAT 420 

DYLFVGNIVY TYVVVTVCLK AGLETTAWTK FSHLAVWGSM LTWLVFFGIY STIWPTIPIA 480 

PDMRGQATMV LSSAHFWLGL FLVPTACLIE DVAWRAAKHT CKKTLLEEVQ ELETKSRVLG 540 

KAVLRDSNGK RLNERDRLIK RLGRKTPPTL FRGSSLQQGV PHGYAFSQEE HGAVSQEEVI 600 
RAYDTTKKKS RKK 



Seq ID NO: 223 DNA sequence
Nucleic Acid Accession #: BC017001
Coding sequence: 1-394

1 11 21 31 41 51

I I I I I I

AACGCTGGGC AGGGCCGGCG CGGGTCGGGG GGCGCCCGAG GGGCCOGGGC CGAGCGGCGG 60 

CGCGCAGGGC GGCAGCATCC ACTCGGGCCG CATCGCCGCG GTGCACAACG TGCCGCTGAG 120 

CGTGCTCATC CGGCCGCTGC CGTCCGTGTT GGACCCCGCC AAGGTGCAGA GCCTCGTGGA 180 

CACGATCCGG GAGGACCCAG ACAGCGTGCC CCCCATCGAT GTCCTCTGGA TCAAAGGGGC 240 

CCAGGGAGGT GACTACTTCT ACTCCTTTGG GGGCTGCCAC CGCTACGCGG CCTACCAGCA 300 

ACTGCAGCGA GAGACCATCC CCGCCAAGCT TGTCCAGTCC ACTCTCTCAG ACCTAAGGGT 360 

GTACCTGGGA GCATCCACAC CAGACTTGCA GTAGCAGCCT CCTTGGCACC TGCTGCCACC 420 

TTCAAGAGCC CAGAAGACAC ACCTGGCCTC CAGCAGGCTG GGCCATGCAG AAGGGATAGC 480 

AGGGGTGCAT TCTCTTTGCA CCTGGCGAGA GGGTCTGACT CTGGGCACCC CTCTCACCGG 540 

CTACAAGGCC TTGGACTCAC TGTACAGTGT GGGAGCCCCA GTTCCCACCT CTGTGACAAT 600 

AGGATCATGG CCTTACCCTT GAAGCATTAC CGAGAAGGAG AACAGAGATG GGCTTGAAGA 660 

GCCACGTGCT GCCGGCTCCA AATTCCCAAG GACAAGGATC CCTCTGCATT TTTGTCTATG 720 

TAACCTCTTA TATGGACTAC ATTCAGCTGC AAGGAAAGGA AAACCTTGAT TGCAGTGGTT 780 

TAAACAAACA GAAGATTGTT TTTCCACATA GCATGGATTC TGGAGATGGG TGGCTAATGG 840 

TATTGGTTCA ACAACTCCAC GGAGGTAGGG GTCACGTCTT GGATCCTTTT GCCTTAATCT 900 

CAGTGCTCGT TACTTCATGG TCCCAAGATG GCTGCTGTAT CCCCAAGAAT CATGTCTGCG 960 

TTCAAGGAAG GAGGGGTGGA GGAAGAGGAA GGGCCAAACT AGCTGGACCC GTCACCTTCT 1020 

ATCAGAAAGT AAAACCTCGT CAGAAGTCTG TTTCCTGCTC TCTCCCTCTG CATATCTTCA 1080 

CTTAGATGCC CTTGGCCCGA GCCAGCTACC ATTGCACCTC TAGCTGCAAA CAAAGCTAAG 1140 

ACAGCAGGGA ACAGAATTGT CATGGCTGAA TAGACCAATC GTGTTCCATC TACTGAGACT 1200 

GGCACACTGC CTCCTGCAAT AAAACTGGGA TCCCATTACC AAGAGAGAAA TGCAGAATTG 1260 

TGTACCAGTT AGCTTTTGCT GTGTAACAAA CCATCCCCAA ACTTGGCAGC TAGAAACAAA 1320 

CCCTGTATTT TCCCACAATC CTATGGGTTG GCAATTTGGG CTGGGCTCAA CAGGGCAGTT 1380 

CTGCTGCTCA CACCTGGGAT CCCTCATGGA GCTAAGGTCA GCTGTTACCT CAGCTGGGCC 1440 

TGGATGGTCT AGGATAGCCT TACTCACTTG CCTGGCAGGT GACAGGCTGT TGGCTGGAAT 1500 

TGCTTGGTTC TCCTCCATGT GGCCTCTCCA GCAGGCTAGC TCAGGCTTAT TCACATGATG 1560 

GCTTCAGGAT TCCAAAGAGA GTGAGAGTAG AAGCTGAAAG ACTTCTTGAG TTCTTGGCCT 1620 

GGAACTGGGA CTAGGACAGT GTCACTTCTG CTAAGTTCTT TTGGTCAGAG CAAATCACAA 1680 

GGCTTTACCC AGATTCAAGG GATGAGAAAC AGACTACATG TCTTGATGAG GGGAACCACA 1740 

AAGAGCTTGT GGCCATTTTT CACCTATCAC AAATAATTTT GGATGGGTAT TTATTTGGAT 1800 

AAAGGTATTT CCCTCTTCCC CCTTTCTCTC TGTCTCATGG GGCCTCACTC TGCCAAGTTG 1860 

GAAGGCACTA AGACATTGTC CTGGCCCTCA GGGTCTAGGG GAAGAGGTGT TGGGGCAGGA 1920 

AGTGAGTCTC TCCATGGGCT GGACCCACTG TAGTAGGAGT GCCTCCTTGT CTGCACTGCT 1980 

GGTATGGGGT TAGGCCAGGT AGGACATTCC AGAGGGGCTT CTGAAAACCA AGAGTCCCTG 2040 

GGGAAAGGGA ACAGAGTAAG GCAGGCCTTG TTCTCACTGC CCTCTAAGGG AACTTGGTCA 2100 

CTCGGCACTT TTAAGCCTCA GTTTCTCCAG TTCAATAATA AGGACAAGAG CTTTTCCCAT 2160 

GCATTCTCTT TCCCCGGGAA AGTTGACTGA GGTGACCAGT AATAGAATTG AAAAGGGAGA 2220 

GTGTCTTCAG TGCAATGTGG CATCCTGGAT TGGGTCTTGG AACAAAAACA GGACATTAGT 2280 

GGGAAAATTG GAAATCTGAA AAAAGTCTGA ATTTTAGTTA ATATACCAAT TTCAGTCTCT 2340 

TGGTTTTGAC AGATGTACCA TGGTGATGTA AGATGTTGAC CTTGGGGTAG GCTGGGTGAA 2400 

GGGTATACAG GAACTCTTTG TACTATCTCT GCAACTTCTC TGTAAATCTA GTATCATTCC 2460 
AAAATAAAAG TTTATTTAAT TTAAAAAAAA AAAAAAAAAA AA 



Seq ID NO: 224 Protein sequence: 
Protein Accession #: AAH17001.1 

1 11 21 31 41 51 

I I I I I I 

TLGRAGAGRG APEGPGPSGG AQGGSIHSGR IAAVHNVPLS VLIRPLPSVL DPAKVQSLVD 60 
TIREDPDSVP PIDVLWIKGA QGGDYFYSFG GCHRYAAYQQ LQRETIPAKL VQSTLSDLRV 120 
YLGASTPDLQ 

Seq ID NO: 225 DNA sequence 
Nucleic Acid Accession #: NM_021048 
Coding sequence: 1..1110 



1 11 21 31 41 51 

I I I I I I 

ATGCCTCGAG CTCCAAAGCG TCAGCGCTGC ATGCCTGAAG AAGATCTTCA ATCCCAAAGT 60 

GAGACACAGG GCCTCGAGGG TGCACAGGCT CCCCTGGCTG TGGAGGAGGA TGCTTCATCA 120 

TCCACTTCCA CCAGCTCCTC TTTTCCATCC TCTTTTCCCT CCTCCTCCTC TTCCTCCTCC 180 

TCCTCCTGCT ATCCTCTAAT ACCAAGCACC CCAGAGGAGG TTTCTGCTGA TGATGAGACA 240 

CCAAATCCTC CCCAGAGTGC TCAGATAGCC TGCTCCTCCC CCTCGGTCGT TGCTTCCCTT 300 

CCATTAGATC AATCTGATGA GGGCTCCAGC AGCCAAAAGG AGGAGAGTCC AAGCACCCTA 360 

CAGGTCCTGC CAGACAGTGA GTCTTTACCC AGAAGTGAGA TAGATGAAAA GGTGACTGAT 420 

TTGGTGCAGT TTCTGCTCTT CAAGTATCAA ATGAAGGAGC CGATCACAAA GGCAGAAATA 480 

CTGGAGAGTG TCATAAAAAA TTATGAAGAC CACTTCCCTT TGTTGTTTAG TGAAGCCTCC 540 

GAGTGCATGC TGCTGGTCTT TGGCATTGAT GTAAAGGAAG TGGATCCCAC TGGCCACTCC 600 

TTTGTCCTTG TCACCTCCCT GGGCCTCACC TATGATGGGA TGCTGAGTGA TGTCCAGAGC 660 

ATGCCCAAGA CTGGCATTCT CATACTTATC CTAAGCATAA TCTTCATAGA GGGCTACTGC 720 

ACCCCTGAGG AGGTCATCTG GGAAGCACTG AATATGATGG GGCTGTATGA TGGGATGGAG 780 

CACCTCATTT ATGGGGAGCC CAGGAAGCTG CTCACCCAAG ATTGGGTGCA GGAAAACTAC 840 

CTGGAGTACC GGCAGGTGCC TGGCAGTGAT CCTGCACGGT ATGAGTTTCT GTGGGGTCCA 900 

AGGGCTCATG CTGAAATTAG GAAGATGAGT CTCCTGAAAT TTTTGGCCAA GGTAAATGGG 960 

AGTGATCCAA GATCCTTCCC ACTGTGGTAT GAGGAGGCTT TGAAAGATGA GGAAGAGAGA 1020 

GCCCAGGACA GAATTGCCAC CACAGATGAT ACTACTGCCA TGGCCAGTGC AAGTTCTAGC 1080 
GCTACAGGTA GCTTCTCCTA CCCTGAATAA 

Seq ID NO: 226 Protein sequence:
Protein Accession #: NP_066386

1 11 21 31 41 51

MPRAPKRQRC MPEEDLQSQS ETQGLEGAQA PLAVEEDASS STSTSSSFPS SFPSSSSSSS 60
SSCYPLIPST PEEVSADDET PNPPQSAQIA CSSPSVVASL PLDQSDEGSS SQKEESPSTL 120
QVLPDSESLP RSEIDEKVTD LVQFLLFKYQ MKEPITKAEI LESVIKNYED HFPLLFSEAS 180
ECMLLVFGID VKEVDPTGHS FVLVTSLGLT YDGMLSDVQS MPKTGILILI LSIIFIEGYC 240
TPEEVIWEAL NMMGLYDGME HLIYGEPRKL LTQDWVQENY LEYRQVPGSD PARYEFLWGP 300
RAHAEIRKMS LLKFLAKVNG SDPRSFPLWY EEALKDEEER AQDRIATTDD TTAMASASSS 360
ATGSFSYPE

Seq ID NO: 227 DNA sequence
Nucleic Acid Accession #: NM_005025.1
Coding sequence: 82-1314

1 11 21 31 41 51

I I I I I I

GCGGAGCACA GTCCGCCGAG CACAAGCTCC AGCATCCCGT CAGGGGTTGC AGGTGTGTGG 60

GAGGCTTGAA ACTGTTACAA TATGGCTTTC CTTGGACTCT TCTCTTTGCT GGTTCTGCAA 120

AGTATGGCTA CAGGGGCCAC TTTCCCTGAG GAAGCCATTG CTGACTTGTC AGTGAATATG 180

TATAATCGTC TTAGAGCCAC TGGTGAAGAT GAAAATATTC TCTTCTCTCC ATTGAGTATT 240

GCTCTTGCAA TGGGAATGAT GGAACTTGGG GCCCAAGGAT CTACCCAGAA AGAAATCCGC 300

CACTCAATGG GATATGACAG CCTAAAAAAT GGTGAAGAAT TTTCTTTCTT GAAGGAGTTT 360

TCAAACATGG TAACTGCTAA AGAGAGCCAA TATGTGATGA AAATTGCCAA TTCCTTGTTT 420

GTGCAAAATG GATTTCATGT CAATGAGGAG TTTTTGCAAA TGATGAAAAA ATATTTTAAT 480

GCAGCAGTAA ATCATGTGGA CTTCAGTCAA AATGTAGCCG TGGCCAACTA CATCAATAAG 540

TGGGTGGAGA ATAACACAAA CAATCTGGTG AAAGATTTGG TATCCCCAAG GGATTTTGAT 600

GCTGCCACTT ATCTGGCCCT CATTAATGCT GTCTATTTCA AGGGGAACTG GAAGTCGCAG 660

TTTAGGCCTG AAAATACTAG AACCTTTTCT TTCACTAAAG ATGATGAAAG TGAAGTCCAA 720

ATTCCAATGA TGTATCAGCA AGGAGAATTT TATTATGGGG AATTTAGTGA TGGCTCCAAT 780

GAAGCTGGTG GTATCTACCA AGTCCTAGAA ATACCATATG AAGGAGATGA AATAAGCATG 840

ATGCTGGTGC TGTCCAGACA GGAAGTTCCT CTTGCTACTC TGGAGCCATT AGTCAAAGCA 900

CAGCTGGTTG AAGAATGGGC AAACTCTGTG AAGAAGCAAA AAGTAGAAGT ATACCTGCCC 960

AGGTTCACAG TGGAACAGGA AATTGATTTA AAAGATGTTT TGAAGGCTCT TGGAATAACT 1020

GAAATTTTCA TCAAAGATGC AAATTTGACA GGCCTCTCTG ATAATAAGGA GATTTTTCTT 1080

TCCAAAGCAA TTCACAAGTC CTTCCTAGAG GTTAATGAAG AAGGCTCAGA AGCTGCTGCT 1140

GTCTCAGGAA TGATTGCAAT TAGTAGGATG GCTGTGCTGT ATCCTCAAGT TATTGTCGAC 1200

CATCCATTTT TCTTTCTTAT CAGAAACAGG AGAACTGGTA CAATTCTATT CATGGGACGA 1260

GTCATGCATC CTGAAACAAT GAACACAAGT GGACATGATT TCGAAGAACT TTAAGTTACT 1320

TTATTTGAAT AACAAGGAAA ACAGTAACTA AGCACATTAT GTTTGCAACT GGTATATATT 1380

TAGGATTTGT GTTTTACAGT ATATCTTAAG ATAATATTTA AAATAGTTCC AGATAAAAAC 1440

AATATATGTA AATTATAAGT AACTTGTCAA GGAATGTTAT CAGTATTAAG CTAATGGTCC 1500
TGTTATGTCA TTGTGTTTGT GTGCTGTTGT TTAAAATAAA AGTACCTATT GAACATGTG

Seq ID NO: 228 Protein sequence:
Protein Accession #: NP_005016.1

1 11 21 31 41 51

I I I I I I

MAFLGLFSLL VLQSMATGAT FPEEAIADLS VNMYNRLRAT GEDENILFSP LSIALAMGMM 60
ELGAQGSTQK EIRHSMGYDS LKNGEEFSFL KEFSNMVTAK ESQYVMKIAN SLFVQNGFHV 120
NEEFLQMMKK YFNAAVNHVD FSQNVAVANY INKWVENNTN NLVKDLVSPR DFDAATYLAL 180
INAVYFKGNW KSQFRPENTR TFSFTKDDES EVQIPMMYQQ GEFYYGEFSD GSNEAGGIYQ 240
VLEIPYEGDE ISMMLVLSRQ EVPLATLEPL VKAQLVEEWA NSVKKQKVEV YLPRFTVEQE 300
IDLKDVLKAL GITEIFIKDA NLTGLSDNKE IFLSKAIHKS FLEVNEEGSE AAAVSGMIAI 360
SRMAVLYPQV IVDHPFFFLI RNRRTGTILF MGRVMHPETM NTSGHDFEEL

Seq ID NO: 229 DNA sequence
Nucleic Acid Accession #: NM_003695
Coding sequence: 12-398

1 11 21 31 41 51

I I I I I I

CGACATCAGA GATGAGGACA GCATTGCTGC TCCTTGCAGC CCTGGCTGTG GCTACAGGGC 60

CAGCCCTTAC CCTGCGCTGC CACGTGTGCA CCAGCTCCAG CAACTGCAAG CATTCTGTGG 120

TCTGCCCGGC CAGCTCTCGC TTCTGCAAGA CCACGAACAC AGTGGAGCCT CTGAGGGGGA 180

ATCTGGTGAA GAAGGACTGT GCGGAGTCGT GCACACCCAG CTACACCCTG CAAGGCCAGG 240

TCAGCAGCGG CACCAGCTCC ACCCAGTGCT GCCAGGAGGA CCTGTGCAAT GAGAAGCTGC 300

ACAACGCTGC ACCCACCCGC ACCGCCCTCG CCCACAGTGC CCTCAGCCTG GGGCTGGCCC 360

TGAGCCTCCT GGCCGTCATC TTAGCCCCCA GCCTGTGACC TTCCCCCCAG GGAAGGCCCC 420

TCATGCCTTT CCTTCCCTTT CTCTGGGGAT TCCACACCTC TCTTCCCCAG CCGGCAACGG 480

GGGTGCCAGG AGCCCCAGGC TGAGGGCTTC CCCGAAAGTC TGGGACCAGG TCCAGGTGGG 540

CATGGAATGC TGATGACTTG GAGCAGGCCC CACAGACCCC ACAGAGGATG AAGCCACCCC 600

ACAGAGGATG CAGCCCCCAG CTGCATGGAA GGTGGAGGAC AGAAGCCCTG TGGATCCCCG 660

GATTTCACAC TCCTTCTGTT TTGTTGCCGT TTATTTTGTA CTCAAATCTC TACATGGAGA 720
TAAATGATTT AAACC

Seq ID NO: 230 Protein sequence:
Protein Accession #: NP_003686

1 11 21 31 41 51

I I I I I I

MRTALLLLAA LAVATGPALT LRCHVCTSSS NCKHSVVCPA SSRFCKTTNT VEPLRGNLVK 60
KDCAESCTPS YTLQGQVSSG TSSTQCCQED LCNEKLHNAA PTRTALAHSA LSLGLALSLL 120
AVILAPSL



Seq ID NO: 231 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 126-752

1 11 21 31 41 51 

I I I I I I 

CCGGGCAGGT GGCTCATGCT CGGGAGCGTG GTTGAGCGGC TGGCGCGGTT GTCCTGGAGC 60 

AGGGGCGCAG GAATTCTGAT GTGAAACTAA CAGTCTGTGA GCCCTGGAAC CTCCACTCAG 120 

AGAAGATGAA GGATATCGAC ATAGGAAAAG AGTATATCAT CCCCAGTCCT GGGTATAGAA 180 

GTGTGAGGGA GAGAACCAGC ACTTCTGGGA CGCACAGAGA CCGTGAAGAT TCCAAGTTCA 240 

GGAGAACTCG ACCGTTGGAA TGCCAAGATG CCTTGGAAAC AGCAGCCCGA GCCGAGGGCC 300 

TCTCTCTTGA TGCCTCCATG CATTCTCAGC TCAGAATCCT GGATGAGGAG CATCCCAAGG 360 

GAAAGTACCA TCATGGCTTG AGTGCTCTGA AGCCCATCCG GACTACTTCC AAACACCAGC 420 

ACCCAGTGGA CAATGCTGGG CTTTTTTCCT GTATGACTTT TTCGTGGCTT TCTTCTCTGG 480 

CCCGTGTGGC CCACAAGAAG GGGGAGCTCT CAATGGAAGA CGTGTGGTCT CTGTCCAAGC 540 

ACGAGTCTTC TGACGTGAAC TGCAGAAGAC TAGAGAGACT GTGGCAAGAA GAGCTGAATG 600 

AAGTTGGGCC AGACGCTGCT TCCCTGCGAA GGGTTGTGTG GATCTTCTGC CGCACCAGGC 660 

TCATCCTGTC CATCGTGTGC CTGATGATCA CGCAGCTGGC TGGCTTCAGT GGACCAAATT 720 

TTCAGGATGG CTGTATTCTG CGGTCAGAAT GAGAGAGTCA AGCTGGGCAG AATCTCTCGC 780 

CAAGAGTTCA GCCTTCCTTT GGAGACTGCT CCATCAGTGC CGAGGTGTGT GGGAACAGGC 840 

TTCACTGCAC CGCCATCTTA CTGAGTTGCT TCACGTGAGG AAAAGGGGGC TTTGGCCCTG 900 

TGACTCAGTT CCACATTTTG GATTGCATAC TGGAAAAGAA GCCAATCTTC TTGCTAGTAA 960 

ACCAGCAACC CGGCTGTATA CAGTGGTGAC CCAAGCAATG GATATAAACC TAAAAATCTG 1020 

AGGGAGGGGA GAGGTGGAAT ACAGTAGTTC TTGGAATCTG AAGTCTCCTA TTTGATCAGG 1080 

TTATTTCCTG GGACTTGGCA AAAATCTGAT TGGTGGGGAT CTCCTAGGAC CTAGTGGACA 1140 

TCTGGTATTA ATTTAATCTC AGGAAAAACA AGAAATTAAC CCAGAGAGAG TCTGGGTTTT 1200 

GGAATTCAGC GTAGCTACCT CCAGACCGTG GTGTCTGGCC TCCATTTTTG TCTGTCATTC 1260 

AGCTCTGACT TACAGCTGCA GTCACCTTTG CTATAAGGCA CCTGGGTAGA AGGGTGGATG 1320 

GGCTTCACAT CAATTTTTTT CTTCCTTTAG GGTGGGGGAT TGGTTTGGCT TTCTTTTGTT 1380 

GTGGTTTTTT GTTTTATTTT TGTCAAGATT GATTTTTAGA TGCAAGGACT TGAAAAGACC 1440 

CAGAAGGATG CCACCAGTTT TTCCTTGAGG CCTAGGATTT TTTATTCTGT CCCGAGCAGA 1500 

GGTAATTCCT CACAACTTAG TGCACCAGTA GCACCAGCCA TTTTGAGCAG AGTACCTCTT 1560 

TGGGGAGCTT TTCGTTTTGT TTTGTTTTTA ATTCTCTTTC CTTAGCAGCA AGGTCTTTTT 1620 

TCCTAGAGAA TCTACTCCGT TGCAGAATCA TTGCAACCTC AGGAGCCCTC ACTGATTGAG 1680 

TGCTGTCAGC CTGATATACT ACTTTGGACT CTGGAAACAG ATATGGGTTC TATTCTCTAT 1740 

TTCTACTGTG TGTCGTTAAA CAACCGTCGG AGACCAGATG ACCTGTTAGA TGGCTAGTCC 1800 

TGTATAACTC GACTCTGTAT GTTTCAATGT ATGTTACTGC AATGCTTCAC CTGCTGTACA 1860 
GTGTTTGTGA GATGCTCTTT GAAGATGGTA CTTTTATATT T 

Seq ID NO: 232 Protein sequence: 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

I I I I I I 

MKDIDIGKEY IIPSPGYRSV RERTSTSGTH RDREDSKFRR TRPLECQDAL ETAARAEGLS 60 

LDASMHSQLR ILDEEHPKGK YHHGLSALKP IRTTSKHQHP VDNAGLFSCM TFSWLSSLAR 120 

VAHKKGELSM EDVWSLSKHE SSDVNCRRLE RLWQEELNEV GPDAASLRRV VWIFCRTRLI 180 
LSIVCLMITQ LAGFSGPNFQ DGCILRSE 

Seq ID NO: 233 DNA sequence 

Nucleic Acid Accession #: CAT cluster 

1 11 21 31 41 51 

I I I I I I 

TTTTAATGGT GCTCATATAT ACTGTATTTT TTGTTGTTTA GTTTTACTTA TTGAGAGTGT 60 

CACAACATGA ATCACATAAT CATGATTTTT TTTTTTTACT TTTACTCCCC AAATTATTCA 120 

TGTTTCTTAG ATOGTAGTCA TTGAGAAGTC CCAATAACTC TAAACTTTTG AGTTATAACG 180 

TAGTAAACTT CTCTTTCATC TTTGTGTTAG CTCTGTAGTC TTAACCTGGA TTTTAATTTT 240 

TTTGTTTCCA AAGTCACAAT TGAATTATTC TTAGATACCT TAAGCCACTG AATTCAGTTC 300 

TGTTTGACTG AAAGCAAAAC AACGTGACAG TTTATTTTCA AACACTAACT TCTTGATATT 360 

TTGTTATGGT ATATCTTTTT ATTAAATATT TATTTTGACT AAGCTTTCAT AAAATATTTG 420 

AAGCTATTTT AATCATCAAG TATGGAAAAC AAATTACTAT TGCATTTTCC TATATATGCA 480 

TATATTATGG ATTAACCAGA ATTGTATCAT TTTTGGCCTA ATGTCTGGAT ATAAAAGATA 540 

ATTAGCCTAC TATAGTATTA ATAAATTTTT CAGTTGGTTT GGGCAAATTT AAACCTGAAA 600 

AATAGGTTAA AAAGTAGTTA CAAATTAAAC TTACTAATTT ATACCTGATT TTTTTTCTTG 660 

AATTAAAGTA CATTTTAAAT GAGCTTTATA ATACCTTAAA AAGTTGGTTC TAATTTAAAA 720 

TATGAAAGCT CTGGCTATCA TCCTGGGATA GTAATTTCTA ATTATATAGT ATTTCAAAAC 780 

TATATATTTT TTAGTTCCTT TGAGATAACT AATTTCTAAT TATATATGTT TCAAAAACCA 840 

TATCCTGTAT TTTTTTTAAG AATTGTTTTA TAAATAGGTC ATAAGATACA AGGTCTGCAT 900 

TAGAAGACCC ACTCTTACTA GGTTCCCTAA GGATCTGCCA TAGATTTTTT TTTTTTTTTT 960 

TTTTTTTTAG GTAGTTTAAA GCAAGCACTG ATACCAGTGG GAGTTGGTCT TGATCTAGGA 1020 

GATTCTGTTA AGCATCCAAA AACAATGCCT AATTTCAGTT CTTAGGTTAT GGCTTGTGAC 1080 

TCCAGATAAA AGATGGAGAA TACCTCATGT ACTGTGACTT GAAAATGAAT TCTTAAAATT 1140 

CTTAGGCTCT CTCCATGTAT CTTTCTTAAG GAAAAGTTTC TGAGTGTGAT CTCTCTTTTG 1200 

CCATAGTATC AAGTGGAGGG TAGTTCAGAA AAGTTAATAG GAAATCTTTT GTGACAGCAG 1260 

ACTATAATAG AAGTTTGAGT AATATTTTAA TAAATTTATA TAATTCAAAT GATAAAAATG 1320 

TATCAATGTT ATCCAATGAT TTTTATTAAA AAATTACCTT ATTATTAGAA CTGTGCCTAT 1380 

TACATAAAAA GTGCTCATGT ATTTGAATTT TAAATAATTT ATTTAAATCA AGACCACCAT 1440 

AAGTCATTAA TAATTTAATA ATTGTTTTAA ATCAGTGGTT TTCAACCCTC ACTTCATATT 1500 

AGAATCATCT GAGGACTTTT AATATGGAAT CCACCTCATA ACAATTAAGT CTAAATTTCT 1560 

GGAAGATGGA GCCATGCTTG TTTTTCCAAA AGCTCTTTGA GTGATTCTAA TTTGTAGTCA 1620 

GAGTTGAAGA CCACTGCTCT AAATTAGTGC AGGAAAATGC TTTTATTTCT CCCATGTTAA 1680 

CTTTTAAAAC TAGTAATGTA CCCAGTTAAG TTTTGATGGT TTAAATTCCA CTAAAGAACA 1740 

TATTCTTCTA ATAACTAGCA TTTATTACAT GAAATTTAAG AGTTTAAGTT CCATCAAACT 1800 

AGCCCTTGTG TAAGATTATT ATTTCTTCTC TATAACTTCA AAATAGATAT TTCATTCAAA 1860 

CTGTTCAGGT GAGAAAACAT AATGGATTTT TTTTTTTTTC CTCTGGAGCT GCCTGTTCAG 1920 

TGAGATGGAG GAGGTGGGCA CATTTAAGGT CAGTTCACTA ACCTATGGTT CAGAGTTCTG 1980 

ATCATATGGA AGTTTGGAAA AGAGAGCTTA TCACAGGTTT GTATGCTGGT GAATGGATAG 2040 

TTTTAATTCT CACTGTCTCA AAAGAGAATC AGCTCTCCAG CAGTTCTAGA AAAGCTTTGA 2100 

CAATCCCCAA GGGGCAGTGT TACCTTACTC CTTCACTGCT TCTTAGAAGG TAGAATTAAG 2160 

TTTCTGGAAT TGCACCTACA TGTTTTCTTA TTAACATTCA GAATTGGGAA TATTAATTTT 2220 
TCCAGTGAGT AGTTTTCTGA AATTGGTAAC TTGGAGAGTA AAATAACGTA TTTTGCTTTT 2280 

CAATTTTGTG TTTGTTTACT TTTATGTAAA AATTTGATAT GTGAATTACA CAGTTCTAAT 2340 

AAAACCTCAT GCCTTTTCAT TACATCTAAT TTGAACTCTC AACTTCAGTG CCAGAAGTGC 2400 

TTTAAAGATG CTTTAATGAA AAGTATTAAG AAAATATATA GATTTGTATG TCAGTTTATA 2460 

CTTCAGAAAT CCATATATTT GTCATATTTA TTTTTTTAGA AACCTCCTAA TTGGATAACT 2520 

AGATGGTATT TAAAATGAAT GCCCAAAAAT ATCTTGTACC TTTGTCCAAA AGTTTATCTG 2580 

TTGGAAGCCG CCAGCCATTC ATGTAGAGAG TTTATAAGAA AATAATTTAA AATTGTATGC 2640 

ATTTTATATT ACTATGGTAT CTGTGTACCA TATTTCTAAG TATTCATTAT TAAATTGGTA 2700 

CTTCTTAAAA CCATAACCTG GCTTGCCTTT TAGTGTTAAA CACAAAATCC AACATTGTAT 2760 

ATAGAGATTC TTCTTTTATG AAGAAGAGCT GACGTAATTT ATTACCAGTG CATCTGCACA 2820 

AAGACATTAA CATAAGTCTC TGAGCAGTGA TACATTTTCA AACATGAAGA GTGACAACCA 2880 

CCACATTAAA CAACCACGGC AACACTCAGA CTTGGCACTT TCCTACGAAT CCATCCTATA 2940 

TGTGCCTGGT ATCGCCTCTG GCATAACTTA CACGAATCGT CCTCCCTACT TGTCTACGCT 3000 

CCTTCATCAA GCACTTGCCA ACACATTCAC CTCTAACTTG TACAACCTTA CCAACTCACC 3060 

ACAACATCTG CAACTCTACC CTATCAACTG CCAACCTAAA GACCCCCAAC ACAACACAAC 3120 

CCCCAAACAC AAAACCACTA AATCATAACC ACCACACACG CCACACACCA CACACCCACC 3180 
CACACAACCA ACACACCACG ACCAAACACC CCACCACAAA CAAGCTAACA ACCACAAACA 3240 
GACAACACAT CACATACACT CACTACCCCC CCATACTCCC ACCCACCA 

Seq ID NO: 234 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 27-281 

1 11 21 31 41 51 

AGCAGGAGGA GAGCTGGCGG GAAGACATGC ACCCCTTGAA GACCCAGAGA GAGGCCGTCT 60 

GTCTACCGCG TAGCAGTTAC ATCAGACTGA GACACTTCCT GTTTACAGGA GACTATAAAA 120 

TTCCTGCCCC GTGCTCATTT GGGGCTGACG CCATTTTAGG CCTCAGCCCA TCTGCACCCA 180 

GGCGCTCACT GAAACAGTGT GTTGCTCCAC ACCGCCTTGT TTTGCTTGTT GGCGCGCTCT 240 

CAGGGTTCCG ACCAATCCAA GAGCCTTGCA GAAAGCATTA ACGTGCTTTT CTCTTTGGCA 300 

GAGTTTTTCT TTGCTCTGAT CTTGGAGACA TCCCTCTGCC TAGTGGAAAC ATAAGGAATA 360 

CAGAAAGAAT GCAAGGAGAT AGACCAACGT GAGATTCTCC TTCATGCACT CAAGAGAAAG 420 

ATGTTGCAGG AAGAGCTAGT CTTTCAGGCT GGGCTGGTGA CCTGAGAAAG AATGTCCAGC 480 

TTTTCTTCTC CACTTGGCAT ATCAAGAGCC AGGCGTGGAA GACTAAAACA GGAAATGTTT 540 

ATAAAAACTG TTCAGCGGTT CGCCAACAAG AAGTGGTAAA GTAGCAAAAA TGGGGATGGA 600 

GATGCCAGGA GGAAAGATGC CAGGGGTAAA GTGGGAAAAT GGGAACCTGA AGCCAGGAGG 660 

TCAAGCCAAG CCAACAGGTG TTCTGTTTTT CATCACAGAA CTAATAAGTG GTGCTGAGGA 720 

CTCAAACCCG GGGAAGCCCA CTCTAGAACC CATGCTGGTC ATCCATATCC CCAAGGCCCT 780 

GGTCAGAACA CAGCTAAGCA GATGGCTTGG GTCATCAGGA CGTCCATTAC ATCCAAAGGA 840 

AGACAGCCTG TGACGTTTCA AAAGCAAAAG TCCCCTACCA GCCAGTGAAG CTACCTGATT 900 

TCTCAGTATC TTACGCCCAG TGACACGATC TACCCTCAAA ACTTAAAAAA AAAAGGGAAA 960 

CATAAACACA TAACAGCAGC AGCAATAATT AAAGATGAGA TGAGAACAAT TAAGAAAAAA 1020 

GGAAAGGTCT CCTGTGACTG TTTTATTTTT AGGGAAACAG AGAGGAAGAA GAATGATTTT 1080 

TCTTTTGATG ACTCTATATC CAACTCTGAG GTTTGATTAA AGAAATGACC TTGAACCACA 1140 

GCAAAGAAAA ATAAAAGACA ATTTCCAGTA AGTATGCCAG TTCGAATTAA TGATTTACTT 1200 

TTTATTTTTA AACTGAATTC AGCAGAGATT TACATGCATT ACGATGATTA ACATCTGAAA 1260 

TTTGACCTTG AAATAATCTT TACATTGTAA ATTCTTAATG ATCAAAACAA GGTTCTCAGT 1320 

GATTAAAACA TATTAGTAAT TAATTATTAA AGGAGAATAA TTGCAAATAC AACATTCCTA 1380 

AAATCTCAAG GCTTTTAAAG CATTTGTACA AATGACTGGA CATTTTTTAA ATTTGAAAAA 1440 

AAAAAAAAGC CCTCCATCTG ATTCTCATTT TCATTGTCAG TGCAACAACA AAAAAGGTAT 1500 

GCACTTCTCT TCTCATTTTC CACTGTCTCG CAAGCTAGAA ATTCTCACGA CTACCTTTGA 1560 

TCCCATCAAA GCCAAAGAAA GAAAAGAAAA TTGTTCTGTA CAGATATATG ACATTAAAAA 1620 
ATAATCCC 



Seq ID NO: 235 Protein sequence: 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

| | | | | | 

MHPLKTQREA VCLPRSSYIR LRHFLFTGDY KIPAPCSFGA DAILGLSPSA PRRSLKQCVA 60 
PHRLVLLVGA LSGFRPIQEP CRKH 
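
The relationship between a "Coding sequence" range and the protein entry that follows it can be checked mechanically. The sketch below is illustrative only and is not part of the original listing: it assumes the standard genetic code, and the helper names `clean` and `translate` are hypothetical. It translates the first row of Seq ID NO: 234 starting at position 27 and recovers the first residues of Seq ID NO: 235. The range 27-281 spans 255 nucleotides, that is 85 codons, or 84 residues plus a stop codon, which matches the 84 residues listed above.

```python
from typing import Optional

# Standard genetic code (assumed), with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def clean(listing: str) -> str:
    """Keep only nucleotide letters, dropping spaces, digits and line breaks."""
    return "".join(ch for ch in listing.upper() if ch in "ACGTN")


def translate(dna: str, cds_start: int, cds_end: Optional[int] = None) -> str:
    """Translate from the 1-based cds_start up to cds_end or the first stop codon."""
    region = dna[cds_start - 1:cds_end]
    protein = []
    for pos in range(0, len(region) - 2, 3):
        residue = CODON_TABLE.get(region[pos:pos + 3], "X")  # X for unreadable codons
        if residue == "*":  # stop codon ends the protein
            break
        protein.append(residue)
    return "".join(protein)


# First row of Seq ID NO: 234, copied from the listing including its position number.
first_row = "AGCAGGAGGA GAGCTGGCGG GAAGACATGC ACCCCTTGAA GACCCAGAGA GAGGCCGTCT 60"
print(translate(clean(first_row), 27))
```

Passing the full cleaned sequence together with the stated end position (281) would give the complete protein; the same check can be used to spot transcription errors in either entry of a DNA and protein pair.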

Seq ID NO: 236 DNA sequence 

Nucleic Acid Accession #: NM_002075 

Coding sequence: 406..1428 

1 11 21 31 41 51 

CCACAATAGG GGCAGACCTG TCCATCCTTC TCTGTGGGTC CCCTGTACCT TTCTCCCCCA 60 

ACAGGATCAG ACCCAGAGGC AGCTGGTTGG GGTTTGTCGA GAAGAAGGAT TATCCAGATC 120 

AGTCCTTTCT AATCTCAGCT CCTGCCTGTA CCCTCCCATA CTCACCAAAC CCTCTTCCCC 180 

ACCACCCTGA GCTGAGGAGC ACAGTTTGAG GCCCCCCCAA CCCCCCGCCG GTCGGGGCCA 240 

GGCCAGGCCA GGCCAGCTCC TCTGGCAGCA GAGCCTGGGC AGGTGACGGG CGGGCGCGGG 300 

CGTCGCAGCT GAGGGAGTAA GGAGGCTCCC AGGAACCGGA GCTGGAAACC CGGCCGAGGT 360 

CCAGCCAGAG CCCAAGAGCC AGAGTGACCC CTCGACCTGT CAGCCATGGG GGAGATGGAG 420 

CAACTGCGTC AGGAAGCGGA GCAGCTCAAG AAGCAGATTG CAGATGCCAG GAAAGCCTGT 480 

GCTGACGTTA CTCTGGCAGA GCTGGTGTCT GGCCTAGAGG TGGTGGGACG AGTCCAGATG 540 

CGGACGCGGC GGACGTTAAG GGGACACCTG GCCAAGATTT ACGCCATGCA CTGGGCCACT 600 

GATTCTAAGC TGCTGGTAAG TGCCTCGCAA GATGGGAAGC TGATCGTGTG GGACAGCTAC 660 

ACCACCAACA AGGTGCACGC CATCCCACTG CGCTCCTCCT GGGTCATGAC CTGTGCCTAT 720 

GCCCCATCAG GGAACTTTGT GGCATGTGGG GGGCTGGACA ACATGTGTTC CATCTACAAC 780 

CTCAAATCCC GTGAGGGCAA TGTCAAGGTC AGCCGGGAGC TTTCTGCTCA CACAGGTTAT 840 

CTCTCCTGCT GCCGCTTCCT GGATGACAAC AATATTGTGA CCAGCTCGGG GGACACCACG 900 

TGTGCCTTGT GGGACATTGA GACTGGGCAG CAGAAGACTG TATTTGTGGG ACACACGGGT 960 

GACTGCATGA GCCTGGCTGT GTCTCCTGAC TTCAATCTCT TCATTTCGGG GGCCTGTGAT 1020 

GCCAGTGCCA AGCTCTGGGA TGTGCGAGAG GGGACCTGCC GTCAGACTTT CACTGGCCAC 1080 
GAGTCGGACA TCAACGCCAT CTGTTTCTTC CCCAATGGAG AGGCCATCTG CACGGGCTCG 1140 

GATGACGCTT CCTGCCGCTT GTTTGACCTG CGGGCAGACC AGGAGCTGAT CTGCTTCTCC 1200 

CACGAGAGCA TCATCTGCGG CATCACGTCC GTGGCCTTCT CCCTCAGTGG CCGCCTACTA 1260 

TTCGCTGGCT ACGACGACTT CAACTGCAAT GTCTGGGACT CCATGAAGTC TGAGCGTGTG 1320 

GGCATCCTCT CTGGCCACGA TAACAGGGTG AGCTGCCTGG GAGTCACAGC TGACGGGATG 1380 

GCTGTGGCCA CAGGTTCCTG GGACAGCTTC CTCAAAATCT GGAACTGAGG AGGCTGGAGA 1440 

AAGGGAAGTG GAAGGCAGTG AACACACTCA GCAGCCCCCT GCCCGACCCC ATCTCATTCA 1500 

GGTGTTCTCT TCTATATTCC GGGTGCCATT CCCACTAAGC TTTCTCCTTT GAGGGCAGTG 1560 

GGGAGCATGG GACTGTGCCT TTGGGAGGCA GCATCAGGGA CACAGGGGCA AAGAACTGCC 1620 

CCATCTCCTC CCATGGCCTT CCCTCCCCAC AGTCCTCACA GCCTCTCCCT TAATGAGCAA 1680 

GGACAACCTG CCCCTCCCCA GCCCTTTGCA GGCCCAGCAG ACTTGAGTCT GAGGCCCCAG 1740 

GCCCTAGGAT TCCTCCCCCA GAGCCACTAC CTTTGTCCAG GCCTGGGTGG TATAGGGCGT 1800 

TTGGCCCTGT GACTATGGCT CTGGCACCAC TAGGGTCCTG GCCCTCTTCT TATTCATGCT 1860 

TTCTCCTTTT TCTACCTTTT TTTCTCTCCT AAGACACCTG CAATAAAGTG TAGCACCCTG 1920 
GT 

Seq ID NO: 237 Protein sequence: 
Protein Accession #: NP_002066 

1 11 21 31 41 51 

| | | | | | 

MGEMEQLRQE AEQLKKQIAD ARKACADVTL AELVSGLEVV GRVQMRTRRT LRGHLAKIYA 60 

MHWATDSKLL VSASQDGKLI VWDSYTTNKV HAIPLRSSWV MTCAYAPSGN FVACGGLDNM 120 

CSIYNLKSRE GNVKVSRELS AHTGYLSCCR FLDDNNIVTS SGDTTCALWD IETGQQKTVF 180 

VGHTGDCMSL AVSPDFNLFI SGACDASAKL WDVREGTCRQ TFTGHESDIN AICFFPNGEA 240 

ICTGSDDASC RLFDLRADQE LICFSHESII CGITSVAFSL SGRLLFAGYD DFNCNVWDSM 300 
KSERVGILSG HDNRVSCLGV TADGMAVATG SWDSFLKIWN 

Seq ID NO: 238 DNA sequence 

Nucleic Acid Accession #: CAT cluster 

1 11 21 31 41 51 

| | | | | | 

TCCCAATGTG TNGAACCTAC CATAAATTCT TTTCTTACNG GACAATCTTA TNCTAANCAA 60 

TACCATTTGC TTTTAAGGCA GATAATCCTC CAAGTTTTCT AATGATATCT GAAACTATTA 120 

ACTGATTCTG TGAATTATGA AATCTGAAAA GGAATTGGAA GTTGCTAAAA ATCTATCATT 180 

TGCATTGACC AGTGTGAAGC ACAGTGGAAT GAGAATGCGT GCCCTGACAC CAAAGAAAAA 240 

TAAGTGACTG GAAAGCTGAA GAATCACCGG CTTCAGTGAC ATGGAACCCA GTGATTTGAT 300 

TTTTGACGAG TATCGGGTGA CTTTGAGGTG GTCAAGAAAC CACACTTTAA GAACAATGTC 360 

CAAAAAGGGG AAAAAAAAGA GCAACCAAAG AAAAAAAATC CATAAAATTG CACAGAAGAA 420 

AAGAAAGAAA AATAAAATAC ACAATATGGA CGATGGAGAA AAACAGTTAC ATTTCTTTAT 480 

GGATCAAGAA GTTTGTGTAC ACATAATCTC ATTTTGAGAT ATATAACTAT TTTTGTCTTT 540 

CAGAAGTGAA TCAAAATATT TCAAAATGCT GTCTTATGAA ACTACAATAT TCTCACAGAT 600 

TAGAAAAGTT TTTCTGTAAA AGTCAGATAG TAAATATTTT AGGTTTTGCA GTGTCTTTTG 660 

CAACTACTCA ACTTTCCTAC TGTAGCACAA GAGTAGCTGT GGTACTGTGC AAATAAATTG 720 

CTTGTGTTCC AATAAAGCTT CATTTACAAA AACATGCCAT GGGCCATATT TGGCCTGTAC 780 

ACTGTTGTTT GCCAAGTCCT AATATAGTTG CTTAGCAAGT ATTGTGAGCT ATTTGAGGAA 840 

GACATGAAAG TTCATTGGGT TGCTAAAAAG TATGTAGAAA TTCAAAGGAA AATTAAAATT 900 

TAGGCTAAGT TATAATACAC TGTTTTAACA ATTGTAAAAT GTAAGAGAAA TTTACAAATA 960 
AAAATCCCAA ATAAAA 

Seq ID NO: 239 DNA sequence 

Nucleic Acid Accession #: NM_001786.1 

Coding sequence: 130-1023 

1 11 21 31 41 51 

GGGGGGGGGG GGCACTTGGC TTCAAAGCTG GCTCTTGGAA ATTGAGCGGA GAGCGACGCG 60 

GTTGTTGTAG CTGCCGCTGC GGCCGCCGCG GAATAATAAG CCGGGATCTA CCATACCCAT 120 

TGACTAACTA TGGAAGATTA TACCAAAATA GAGAAAATTG GAGAAGGTAC CTATGGAGTT 180 

GTGTATAAGG GTAGACACAA AACTACAGGT CAAGTGGTAG CCATGAAAAA AATCAGACTA 240 

GAAAGTGAAG AGGAAGGGGT TCCTAGTACT GCAATTCGGG AAATTTCTCT ATTAAAGGAA 300 

CTTCGTCATC CAAATATAGT CAGTCTTCAG GATGTGCTTA TGCAGGATTC CAGGTTATAT 360 

CTCATCTTTG AGTTTCTTTC CATGGATCTG AAGAAATACT TGGATTCTAT CCCTCCTGGT 420 

CAGTACATGG ATTCTTCACT TGTTAAGAGT TATTTATACC AAATCCTACA GGGGATTGTG 480 

TTTTGTCACT CTAGAAGAGT TCTTCACAGA GACTTAAAAC CTCAAAATCT CTTGATTGAT 540 

GACAAAGGAA CAATTAAACT GGCTGATTTT GGCCTTGCCA GAGCTTTTGG AATACCTATC 600 

AGAGTATATA CACATGAGGT AGTAACACTC TGGTACAGAT CTCCAGAAGT ATTGCTGGGG 660 

TCAGCTCGTT ACTCAACTCC AGTTGACATT TGGAGTATAG GCACCATATT TGCTGAACTA 720 

GCAACTAAGA AACCACTTTT CCATGGGGAT TCAGAAATTG ATCAACTCTT CAGGATTTTC 780 

AGAGCTTTGG GCACTCCCAA TAATGAAGTG TGGCCAGAAG TGGAATCTTT ACAGGACTAT 840 

AAGAATACAT TTCCCAAATG GAAACCAGGA AGCCTAGCAT CCCATGTCAA AAACTTGGAT 900 

GAAAATGGCT TGGATTTGCT CTCGAAAATG TTAATCTATG ATCCAGCCAA ACGAATTTCT 960 

GGCAAAATGG CACTGAATCA TCCATATTTT AATGATTTGG ACAATCAGAT TAAGAAGATG 1020 

TAGCTTTCTG ACAAAAAGTT TCCATATGTT ATGTCAACAG ATAGTTGTGT TTTTATTGTT 1080 

AACTCTTGTC TATTTTTGTC TTATATATAT TTCTTTGTTA TCAAACTTCA GCTGTACTTC 1140 

GTCTTCTAAT TTCAAAAATA TAACTTAAAA ATGTAAATAT TCTATATGAA TTTAAATATA 1200 
ATTCTGTAAA TGTGAAAAAA AAAAAAAAAA AAAAA 



Seq ID NO: 240 Protein sequence: 
Protein Accession #: NP_001777.1 

1 11 21 31 41 51 

| | | | | | 

MEDYTKIEKI GEGTYGVVYK GRHKTTGQVV AMKKIRLESE EEGVPSTAIR EISLLKELRH 60 
PNIVSLQDVL MQDSRLYLIF EFLSMDLKKY LDSIPPGQYM DSSLVKSYLY QILQGIVFCH 120 
SRRVLHRDLK PQNLLIDDKG TIKLADFGLA RAFGIPIRVY THEVVTLWYR SPEVLLGSAR 180 
YSTPVDIWSI GTIFAELATK KPLFHGDSEI DQLFRIFRAL GTPNNEVWPE VESLQDYKNT 240 
FPKWKPGSLA SHVKNLDENG LDLLSKMLIY DPAKRISGKM ALNHPYFNDL DNQIKKM 


Seq ID NO: 241 DNA sequence 

Nucleic Acid Accession #: NM_033379. 

Coding sequence: 132-854 

1 11 21 31 41 51 
| | | | | | 

CGCCOGCGCG CGGGCTCAAC TTTGTAGAGC GAGGGGCCAA CTTGGCAGAG CGCGCGGCCA 60 
GCTTTGCAGA GAGCGCCCTC CAGGGACTAT GCGTGCGGGG ACACGGGATC TACCCATACC 120 
ATTGACTAAC TATGGAAGAT TATACCAAAA TAGAGAAAAT TGGAGAAGGT ACCTATGGAG 180 
TTGTGTATAA GGGTAGACAC AAAACTACAG GTCAAGTGGT AGCCATGAAA AAAATCAGAC 240 
TAGAAAGTGA AGAGGAAGGG GTTCCTAGTA CTGCAATTCG GGAAATTTCT CTATTAAAGG 300 
AACTTCGTCA TCCAAATATA GTCAGTCTTC AGGATGTGCT TATGCAGGAT TCCAGGTTAT 360 
ATCTCATCTT TGAGTTTCTT TCCATGGATC TGAAGAAATA CTTGGATTCT ATCCCTCCTG 420 
GTCAGTACAT GGATTCTTCA CTTGTTAAGG TAGTAACACT CTGGTACAGA TCTCCAGAAG 480 
TATTGCTGGG GTCAGCTCGT TACTCAACTC CAGTTGACAT TTGGAGTATA GGCACCATAT 540 
TTGCTGAACT AGCAACTAAG AAACCACTTT TCCATGGGGA TTCAGAAATT GATCAACTCT 600 
TCAGGATTTT CAGAGCTTTG GGCACTCCCA ATAATGAAGT GTGGCCAGAA GTGGAATCTT 660 
TACAGGACTA TAAGAATACA TTTCCCAAAT GGAAACCAGG AAGCCTAGCA TCCCATGTCA 720 
AAAACTTGGA TGAAAATGGC TTGGATTTGC TCTCGAAAAT GTTAATCTAT GATCCAGCCA 780 
AACGAATTTC TGGCAAAATG GCACTGAATC ATCCATATTT TAATGATTTG GACAATCAGA 840 
TTAAGAAGAT GTAGCTTTCT GACAAAAAGT TTCCATATGT TATGTCAACA GATAGTTGTG 900 
TTTTTATTGT TAACTCTTGT CTATTTTTGT CTTATATATA TTTCTTTGTT ATCAAACTTC 960 
AGCTGTACTT CGTCTTCTAA TTTCAAAAAT ATAACTTAAA AATGTAAATA TTCTATATGA 1020 
ATTTAAATAT AATTCTGTAA ATGTGAAAAA AAAAAAAAAA AAAAAA 
Seq ID NO: 242 Protein sequence: 
Protein Accession #: NP_203698.1 

1 11 21 31 41 51 

| | | | | | 
MEDYTKIEKI GEGTYGVVYK GRHKTTGQVV AMKKIRLESE EEGVPSTAIR EISLLKELRH 
PNIVSLQDVL MQDSRLYLIF EFLSMDLKKY LDSIPPGQYM DSSLVKVVTL WYRSPEVLLG 
SARYSTPVDI WSIGTIFAEL ATKKPLFHGD SEIDQLFRIF RALGTPNNEV WPEVESLQDY 
KNTFPKWKPG SLASHVKNLD ENGLDLLSKM LIYDPAKRIS GKMALNHPYF NDLDNQIKKM 

Seq ID NO: 243 DNA sequence 

Nucleic Acid Accession #: AF101051.1 

Coding sequence: 221-856 



I 

GAGCAACCTC 
CGACCCAGAG 
GCGGGGCCCA 
ACCTGCCACC 
GCTGTTGGGC 
GCCCCAGTGG 
CGAGGGGCTG 
TGACTCCTTG 
CATCCTCCTG 
CTTGGAAGAC 
TCTTGCAGGT 
ATTCTATGAC 
TGGCTGGGCT 
CCGAAAAACA 
GAAAGACTAC 
GGACATTGAG 
GTATGGTATT 
AAACATGGCT 
TTGTATTACT 
TATATATAGA 
CTCATTATGT 
CCATATTGAT 
CAGTCAAATA 
CTAATTTACC 
TTATTTTTTA 
TTTCATTGGT 
AGCCAAGAAG 
GTGATAAATT 
TTTGCTTTGA 
CACAACTTTA 
ACCTTTTTGT 
TATATCTTCC 
GATAATCTGG 
TCTTTTTTCT 
AATATTAATT 
TTTATTTGCT 
CTTCATGTGA 
ACACATACCT 
AAACCTACGC 
ATTCTTTCAG 
TTTCCAGTCT 



11 

I 

AGCTTCTAGT 
CTTCTCCAGC 
GCCACCTTCG 
CCTGAGCCAG 
TTCATTCTCG 
AGGATTTACT 
TGGATGTCCT 
CTGAATCTGA 
GGAGTGATAG 
GATGAGGTGC 
CTGGCTATTT 
CCTATGACCC 
GCTGCTTCTC 
ACCTCTTACC 
GTGTGACACA 
ATACTATCAT 
ACAAAACAAA 
TAATCTTATT 
GCTTCCCATT 
TATGTATATA 
TGATACTAGC 
GAAGATGTTT 
TCATTTACTC 
AAGGATGAAT 
CCATAATCTT 
CTCTATCTCC 
AATTTATTAC 
CCTGTTGACC 
AAATATTTGT 
TTGATTGAAT 
TCCCCATTCC 
TAATAAGGTG 
TGACAAATAT 
ATCTGCCAAA 
AGTTTATATT 
CAGCTGGCTG 
TTCACTGCCT 
TCATGTGGTT 
ACATACCTTC 
CTGTGTCTGA 
GTACAGAATG 



21 

i 

ATCCAGACTC 
GGCGGCGCAG 
GGAGTCCGGG 
CGCGGGCGCC 
CCTTCCTGGG 
CCTATGCCGG 
GCGTGTCGCA 
GCAGCACATT 
CAATCTTTGT 
AGAAGATGAG 
TAGTTGCCAC 
CAGTCAATGC 
TCTGCCTTCT 
CAACACCAAG 
GAGGCAAAAG 
TAACATTAGG 
CAAACAAACA 
TTATCTTCTT 
GAGTAATCAT 
TACATGTTTT 
ATACTTAAAA 
ATTGGTATAT 
TTCTTCATTA 
TCTTTCAATT 
ATAGCACTTG 
TGAATCTAAC 
AAATCAGAAC 
TTCCCACACA 
CCAATTGAGT 
TTTTAAGCTA 
TTAATTGTAT 
TGGTCTGTTT 
TCTCTCTGTA 
TTGAGATAAT 
ACTCTCATTC 
AGACACTGAA 
TCCTCTCTCT 
CAGTGCCTTC 
ATGTGGCTCA 
CATGTTTGTG 
CTATTTCACT 



31 
I 

CAGCGCCGCC 
CGAGCAGGGC 
TTGCCCACCT 
CGAGCGAGTC 
ATGGATCGGC 
CGACAACATC 
GAGCACCGGG 
GCAAGCAACC 
GGCCACCGTT 
GATGGCTGTC 
AGCATGGTAT 
CAGGTACGAA 
GGGAGGTGCC 
GCCCTATCCA 
GAGAAAATCA 
ACCTTAGAAT 
AAAAACCCAT 
TCCTCAATAT 
ACTCAAATGG 
TCTATTAAAA 
TATCTCTAAA 
TTTCTTTTTC 
GCTTTGGGTG 
CTTCATGCGT 
CATCGTTATT 
ACATTTCATA 
TTTGGAGGCA 
ATCCCTGTAC 
AGCTGCATGC 
CTTATTCATA 
TGTTTTCCCA 
GTCTGAACAA 
GCTGTAAGCA 
GATACTTAAC 
TTTGAACATG 
GAAGTCACTG 
ACCAGTCTAT 
CTCTCTCTAC 
GTGCCTTCCT 
CTCTGTTCCA 
TGAGCAAGAT 



41 
I 

CCGGGCGCGG 
TCCCCGCCTT 
GCAAACTCTC 
ATGGCCAACG 
GCCATCGTCA 
GTGACCGCCC 
CAGATCCAGT 
CGTGCCTTGA 
GGCATGAAGT 
ATTGGGGGTG 
GGCAATAGAA 
TTTGGTCAGG 
CTACTTTGCT 
AAACCTGCAC 
TGTTGAAACA 
TTTGGGTATT 
GTGTTAAAAT 
AGGAGGGAAG 
GGGAAGGGGT 
ATAGACAGTA 
ATAGGTAAAT 
GTCCTTATAT 
CCTTTGCCAC 
GCCCTTTTCA 
AAGCCCTTAT 
GCCTACATTT 
AATCTTTCTG 
TCTGACCCAT 
TGTTCCCCCA 
GTTTTATATC 
AGTGTAATTA 
AGTGCTAGAC 
AGTCACTTAA 
CAGTTAGAAG 
AACTATGCCT 
AACAAAACCT 
TTCCACTGAA 
CAGTCTATTT 
CTCTCTACCA 
TTTTAACAAC 
GATGTATGGA 



51 



ACCCCAACCC 
AACTTCCTCC 



CGGGGCTGCA 
GCACTGCCCT 
AGGCCATGTA 
GCAAAGTCTT 
TGGTGGTTGG 
GTATGAAGTG 
CGATATTTCT 
TCGTTCAAGA 
CTCTCTTCAC 
GTTCCTGTCC 
CTTCCAGCGG 
AACCGAAAAT 
GTAATCTGAA 
ACTCAGTGCT 
ATTTTACCAT 
GCTCCTTAAA 
AAATACTATT 
GTATTTAATT 
ACATATGTAA 
AAGACCTAGC 
TATACTTATT 
TTGTTTTGTG 
TAGTTTCTAA 
CATGACCAAA 
AGCACTCTTG 
GGTGTTGTAA 
CCCCTAAACT 
TCATGCGTTT 
TTTCTGGAGT 
TCTTTCTACC 
AGGTAGTGTG 
ATGTAGTGTC 
ACACACGTAC 
CAAAACCTAC 
CCACTGAACA 
GTCTATTTCC 
TGCTCTTACT 
AAGGGTGTTG 



60 
120 
180 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 




GCACTGGTGT CTGGAGACCT GGATTTGAGT CTTGGTGCTA TCAATCACCG 
AGCAAGGCAT TTGGCTGCTG TAAGCTTATT GCTTCATCTG TAAGCGGTGG 
CTGATCTTCC CACCTCACAG TGATGTTGTG GGGATCCAGT GAGATAGAAT 
GTGGTTTTGT AATTTGAAAA GTGCTATACT AAGGGAAAGA ATTGAGGAAT 
CGTTTTGGTG TTGCTTTTCA AATGTTTGAA AATAAAAAAA TGTTAAGAAA 
GCCTTAACCA GTCTCTCAAG TGATGAGACA GTGAAGTAAA ATTGAGTGCA 
AAGATTCTGA GGAAGTCTTA TCTTCTGCAG TGAGTATGGC CCAATGCTTT 
ACAGATGTAA TGGGAAGAAA TAAAAGCCTA CGTGTTGGTA AATCCAACAG 
TTTTGAATCA TAATAACTCA TAAGGTGCTA TCTGTTCAGT GATGCCCTCA 

TGTTAGCTGG CAGCTGACGC TGCTAGGATA GTTAGTTTGG AAATGGTACT 
CTACACAAGG AAAGTCAGCC ACCGTGTCTT ATGAGGAATT GGACCTAATA 
TGCCTTCCAA ACCTGAGAAT ATATGCTTTT GGAAGTTAAA ATTTAAATGG 
ATACATAGAT CTTCATGATG TGTGAGTGTA ATTCCATGTG GATATCAGTT 
ACAAAAAAAT TTTATGGCCC AAAATGACCA ACGAAATTGT TACAATAGAA 

TTTGATCTTT TTATATTCTT CTACCACACC TGGAAACAGA CCAATAGACA 
TTATAATGGG AATTTGTATA AAGCATTACT CTTTTTCAAT AAATTGTTTT 
AAAAGGAAAA AAAAAAAAAA AAA 







CTAAACGAAT 
CTGTGGCTAA 
CAAGGGAGAT 
GAGCTCTTGC 
TCATAATAAA 
AATTTTAGTG 
CTTTTGCCAC 
ACCAAACATT 
TTTATCCAAT 

TTAATTTAAA 



2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 






Seq ID NO: 244 Protein sequence: 
Protein Accession #: AAD16433.1 

1 11 21 31 41 51 

| | | | | | 

MANAGLQLLG FILAFLGWIG AIVSTALPQW RIYSYAGDNI VTAQAMYEGL WMSCVSQSTG 60 
QIQCKVFDSL LNLSSTLQAT RALMVVGILL GVIAIFVATV GMKCMKCLED DEVQKMRMAV 120 
IGGAIFLLAG LAILVATAWY GNRIVQEFYD PMTPVNARYE FGQALFTGWA AASLCLLGGA 180 
LLCCSCPRKT TSYPTPRPYP KPAPSSGKDY V 

Seq ID NO: 245 DNA sequence 

Nucleic Acid Accession #: CAT cluster 



TTAATGGTTA 
AGCATGGTCC 



CAGTTGTTAT 
AGGTGGGGAG 
TTAATAGCCA 
GTCCTACGCC 



11 



AATGCTGTTT 
CGAGAGTCTG 
GAGATTTAGT 
GAAGAATGCA 
GTCGCTCAAG 
CTGCACTTCA 
CACGGAGTCT 



21 
I 

TTTTTCAAGG 
ACCAAGTGAC 
ACAAACCTCA 
TTCTTCATCG 
TATATTAGAA 
CCCAGGAATT 
GCCTGGGCAA 
CGCTGATTGC 



31 
I 

AGAGCACAAG 
CCAGAGGCAG 
GTTCAAATCC 
TTAACAATGA 
TGCCTGTAGT 
CAAAGCTGCA 
TGTAGTAAGA 
TAGCACAGCA 



Seq ID NO: 246 DNA sequence 

Nucleic Acid Accession #: XM_058553.2 

Coding sequence: 897-1400 



AATTTTCAGA 
TAAATGTATT 
GTGAAACCAT 
CTGTTATCCA 
ATAGAGGGAA 
TGGGATGAGA 
GATCATGTTT 
GTTGAGTGTA 
ACTGGTAACC 
TTATTTCTGT 
TTGTCCAGGC 
GCCTTTGCCT 
TTTTTTTGTT 
TAGTCTTGCT 
CAGCCTCCCA 
AAGAAACTTA 
ACCATCAAAT 
CTGATGTTGC 
CTGAAATTAG 
TCAACCAAAC 
CTTGCGATGA 
GCACAACTCA 
ATAACCTGGC 
ACAATGGAAA 
GTTGCTTCTT 
AACTCCCTGT 
TTTAATGCAA 



11 
I 

AGTTTCGTAT 
TAGTCTCAGT 
TTCTCTTTTA 
TAATATGGAC 
TGAGTATTAA 
GGAGGTGAAA 
AAGAAAAGTC 
TACTGTCTGT 
TGCCTATCTG 
GTTTATGTAT 
CAAGTGCAAT 
CCTGAGTAGC 
TGTTTGTTTG 
TTGTTGCCAG 
GAGTGCTAGG 
CACCGACTCC 
CAGGGCTTGC 
AAGCAAATTG 
TCATCATATC 
CAGGAGCCTT 
AGACTGGGAT 
CTACTCTGAC 
TTCAGGCATG 
TGCACAGTAA 
CTTCTACCAG 
GACTTTCCAA 
GAACCCTCAT 



21 
I 

GGGGATGGTT 
GCTCAATAGA 
ATGTTTCACA 
AGTTCTTGAG 
TTGGAGAAGC 
CCTCACTAGA 
ATGAAAATGG 
CAAAGACTTC 
TATTTTTAAG 
AAGGGGTTTT 
GGCACGAACC 
TGGGACTACA 
TTTGTTTTTG 
GCTAGTCTCA 
ATTACAGCAC 
CTGGACCCTG 
AGGTTTCCTT 
GCTACTTGTC 
TCAAGCTGTG 
AGACAAGAGA 
AAAGATTTGT 
AACAACAGCC 
CGAGTTCCCA 
CTGAATACCT 
TGGGTTCTCA 
ACTGACAAGC 
ACTCAGAAGC 



31 
I 

TTATATAAAT 
AGAGATTTCT 
TTCCTGTTAC 
TCCTAACATT 
TTAAAGTATT 
AAAAGGGACA 
TGAACTAGTG 
CAGCATTTCC 
AACCCAGGAG 
TTGTTTTTTT 
TCATAGCTCC 
GGCATGAGCC 
GGGGGGGTTG 
AACTCCTGGC 
TTGGATTCAG 
AGAAGCTATT 
ATCATCTTAT 
CCTTCAATGC 
ATGACAGAAG 
CTCTGGCTGA 
GGGAGCAGAC 
CTGCGAGCAA 
AATCTCTGCC 
ATCTCATCAA 
TTTTCCTCCT 
ACACTTTTTT 
TTCCAAATAA 



41 
I 

TCAGGTTTTT 
AATAGAAAAG 
AGATTTGTTC 
GAGAGGTTTT 
GCCACTTTAG 
ATGTTAGTGT 
TTTCCAAGCA 
AGGTCCTAGA 
GAAAGCTTTA 
AAAGACAGGA 
TGGACTTAAG 
CCCATGCCTG 
TTTTGTTTTT 
TTCAAGTGAT 
CTTCTTCATT 
GCAATGCCCC 
CAAGTGCAGA 
TCGCCACCAG 
TTGTATTGAG 
GAGCACTTGG 
CAGCACCCCA 
CATAGTTACA 
GTATGTTCTG 
ATGCCAGACC 
AATCTAATTA 
CCTCCCCCCT 
ACCTTTGATA 



51 
I 

CCCACAATAA 
GATTCAAACT 
TCTTGTGACT 
CCCTTAGTGC 
CACTGAAGAT 
GGCCCTTCCT 
TATTGGAAGG 
GAGGAACAAG 
TAATAGAACA 
TCTCACTCCA 
TGATCTGCCT 
GCTAAGTTTG 
TGTAGAGACG 



TCCAACATGG 
TATGACAAAA 
AAGAATCATC 
GTTCCTCGAG 
CAAGATGTTG 
CAGTGCCCTC 
TTTGTCTGGG 
GAACATAAGA 
CCATGGAAAA 
CTAGAAGACT 
TAGAATGGTA 
TGAATCCTCA 
CAGATTG 



Seq ID NO: 247 Protein sequence: 
Protein Accession #: XP_058553.1 



60 
120 
180 



41 


51 




1 

GAACTTTATT 


1 

AATGACTTTC 


60 


CGTGGTTTAG 


TGGTTTCAAC 


120 


TTCTTTTGTC 


TTCACTTAGT 


180 


GGATATTAAT 


ATGTTTCACA 


240 


CTCAGCTACT 


CAGGAGGCTA 


300 


ATGCATTATG 


ATTACAGCTG 


360 


TCCCATCTCT 


GGCTCGGAGG 


420 


GTCTGAGATC 


AAACTGCA 





60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 



1 11 21 31 41 51 

| | | | | | 

MEETYTDSLD PEKLLQCPYD KNHQIRACRF PYHLIKCRKN HPDVASKLAT CPFNARHQVP 60 
RAEISHHISS CDDRSCIEQD VVNQTRSLRQ ETLAESTWQC PPCDEDWDKD LWEQTSTPFV 120 
WGTTHYSDNN SPASNIVTEH KNNIASGMRV PKSLPYVLPW KNNGNAQ 



Seq ID NO: 248 DNA sequence 
Nucleic Acid Accession #: NM_003392 
Coding sequence: 758..1855 

1 11 21 31 41 51 

TTAAGGAAAT CCGGGCTGCT CTTCCCCATC TGGAAGTGGC TTTCCCCACA TCGGCTCGTA 60 

AACTGATTAT GAAACATACG ATGTTAATTC GGAGCTGCAT TTCCCAGCTG GGCACTCTCG 120 

CGCGCTGGTC CCCGGGGCCT CGCCCCCCAC CCCCTGCCCT TCCCTCCCGC GTCCTGCCCC 180 

CATCCTCCAC CCCCCGCGCT GGCCACCCCG CCTCCTTGGC AGCCTCTGGC GGCAGCGCGC 240 

TCCACTCGCC TCCCGTGCTC CTCTCGCCCA TGGAATTAAT TCTGGCTCCA CTTGTTGCTC 300 

GGCCCAGGTT GGGGAGAGGA CGGAGGGTGG CCGCAGCGGG TTCCTGAGTG AATTACCCAG 360 

GAGGGACTGA GCACAGCACC AACTAGAGAG GGGTCAGGGG GTGCGGGACT CGAGCGAGCA 420 

GGAAGGAGGC AGCGCCTGGC ACCAGGGCTT TGACTCAACA GAATTGAGAC ACGTTTGTAA 480 

TCGCTGGCGT GCCCCGCGCA CAGGATCCCA GCGAAAATCA GATTTCCTGG TGAGGTTGCG 540 

TGGGTGGATT AATTTGGAAA AAGAAACTGC CTATATCTTG CCATCAAAAA ACTCACGGAG 600 

GAGAAGCGCA GTCAATCAAC AGTAAACTTA AGAGACCCCC GATGCTCCCC TGGTTTAACT 660 

TGTATGCTTG AAAATTATCT GAGAGGGAAT AAACATCTTT TCCTTCTTCC CTCTCCAGAA 720 

GTCCATTGGA ATATTAAGCC CAGGAGTTGC TTTGGGGATG GCTGGAAGTG CAATGTCTTC 780 

CAAGTTCTTC CTAGTGGCTT TGGCCATATT TTTCTCCTTC GCCCAGGTTG TAATTGAAGC 840 

CAATTCTTGG TGGTCGCTAG GTATGAATAA CCCTGTTCAG ATGTCAGAAG TATATATTAT 900 

AGGAGCACAG CCTCTCTGCA GCCAACTGGC AGGACTTTCT CAAGGACAGA AGAAACTGTG 960 

CCACTTGTAT CAGGACCACA TGCAGTACAT CGGAGAAGGC GCGAAGACAG GCATCAAAGA 1020 

ATGCCAGTAT CAATTCCGAC ATCGACGGTG GAACTGCAGC ACTGTGGATA ACACCTCTGT 1080 

TTTTGGCAGG GTGATGCAGA TAGGCAGCCG CGAGACGGCC TTCACATACG CCGTGAGCGC 1140 

AGCAGGGGTG GTGAACGCCA TGAGCCGGGC GTGCCGCGAG GGCGAGCTGT CCACCTGCGG 1200 

CTGCAGCCGC GCCGCGCGCC CCAAGGACCT GCCGCGGGAG TGGCTCTGGG GCGGCTGCGG 1260 

CGACAACATC GACTATGGCT ACCGCTTTGC CAAGGAGTTC GTGGACGCCC GCGAGCGGGA 1320 

GCGCATCCAC GCCAAGGGCT CCTACGAGAG TGCTCGCATC CTCATGAACC TGCACAACAA 1380 

CGAGGCCGGC CGCAGGACGG TGTACAACCT GGCTGATGTG GCCTGCAAGT GCCATGGGGT 1440 

GTCCGGCTCA TGTAGCCTGA AGACATGCTG GCTGCAGCTG GCAGACTTCC GCAAGGTGGG 1500 

TGATGCCCTG AAGGAGAAGT ACGACAGCGC GGCGGCCATG CGGCTCAACA GCCGGGGCAA 1560 

GTTGGTACAG GTCAACAGCC GCTTCAACTC GCCCACCACA CAAGACCTGG TCTACATCGA 1620 

CCCCAGCCCT GACTACTGCG TGCGCAATGA GAGCACCGGC TCGCTGGGCA CGCAGGGCCG 1680 

CCTGTGCAAC AAGACGTCGG AGGGCATGGA TGGCTGCGAG CTCATGTGCT GCGGCCGTGG 1740 

GTACGACCAG TTCAAGACCG TGCAGACGGA GCGCTGCCAC TGCAAGTTCC ACTGGTGCTG 1800 

CTACGTCAAG TGCAAGAAGT GCACGGAGAT CGTGGACCAG TTTGTGTGCA AGTAGTGGGT 1860 

GCCACCCAGC ACTCAGCCCC GCTCCCAGGA CCCGCTTATT TATAGAAAGT ACAGTGATTC 1920 

TGGTTTTTGG TTTTTAGAAA TATTTTTTAT TTTTCCCCAA GAATTGCAAC CGGAACCATT 1980 

XTTTTTCCTG TTACCATCTA AGAACTCTGT GGTTTATTAT TAATATTATA ATTATTATTT 2040 

GGCAATAATG GGGGTGGGAA CCACGAAAAA TATTTATTTT GTGGATCTTT GAAAAGGTAA 2100 

TACAAGACTT CTTTTGGATA GTATAGAATG AAGGGGGAAA TAACACATAC CCTAACTTAG 2160 

CTGTGTGGGA CATGGTACAC ATCCAGAAGG TAAAGAAATA CATTTTCTTT TTCTCAAATA 2220 

TGCCATCATA TGGGATGGGT AGGTTCCAGT TGAAAGAGGG TGGTAGAAAT CTATTCACAA 2280 

TTCAGCTTCT ATGACCAAAA TGAGTTGTAA ATTCTCTGGT GCAAGATAAA AGGTCTTGGG 2340 

AAAACAAAAC AAAACAAAAC AAACCTCCCT TCCCCAGCAG GGCTGCTAGC TTGCTTTCTG 2400 

CATTTTCAAA ATGATAATTT ACAATGGAAG GACAAGAATG TCATATTCTC AAGGAAAAAA 2460 

GGTATATCAC ATGTCTCATT CTCCTCAAAT ATTCCATTTG CAGACAGACC GTCATATTCT 2520 

AATAGCTCAT GAAATTTGGG CAGCAGGGAG GAAAGTCCCC AGAAATTAAA AAATTTAAAA 2580 

CTCTTATGTC AAGATGTTGA TTTGAAGCTG TTATAAGAAT TGGGATTCCA GATTTGTAAA 2640 

AAGACCCCCA ATGATTCTGG ACACTAGATT TTTTGTTTGG GGAGGTTGGC TTGAACATAA 2700 

ATGAAATATC CTGTATTTTC TTAGGGATAC TTGGTTAGTA AATTATAATA GTAGAAATAA 2760 

TACATGAATC CCATTCACAG GTTTCTCAGC CCAAGCAACA AGGTAATTGC GTGCCATTCA 2820 

GCACTGCACC AGAGCAGACA ACCTATTTGA GGAAAAACAG TGAAATCCAC CTTCCTCTTC 2880 

ACACTGAGCC CTCTCTGATT CCTCCGTGTT GTGATGTGAT GCTGGCCACG TTTCCAAACG 2940 

GCAGCTCCAC TGGGTCCCCT TTGGTTGTAG GACAGGAAAT GAAACATTAG GAGCTCTGCT 3000 

TGGAAAACAG TTCACTACTT AGGGATTTTT GTTTCCTAAA ACTTTTATTT TGAGGAGCAG 3060 

TAGTTTTCTA TGTTTTAATG ACAGAACTTG GCTAATGGAA TTCACAGAGG TGTTGCAGCG 3120 

TATCACTGTT ATGATCCTGT GTTTAGATTA TCCACTCATG CTTCTCCTAT TGTACTGCAG 3180 

GTGTACCTTA AAACTGTTCC CAGTGTACTT GAACAGTTGC ATTTATAAGG GGGGAAATGT 3240 

GGTTTAATGG TGCCTGATAT CTCAAAGTCT TTTGTACATA ACATATATAT ATATATACAT 3300 

ATATATAAAT ATAAATATAA ATATATCTCA TTGCAGCCAG TGATTTAGAT TTACAGCTTA 3360 

CTCTGGGGTT ATCTCTCTGT CTAGAGCATT GTTGTCCTTC ACTGCAGTCC AGTTGGGATT 3420 

ATTCCAAAAG TTTTTTGAGT CTTGAGCTTG GGCTGTGGCC CCGCTGTGAT CATACCCTGA 3480 

GCACGACGAA GCAACCTCGT TTCTGAGGAA GAAGCTTGAG TTCTGACTCA CTGAAATGCG 3540 

TGTTGGGTTG AAGATATCTT TTTTTCTTTT CTGCCTCACC CCTTTGTCTC CAACCTCCAT 3600 

TTCTGTTCAC TTTGTGGAGA GGGCATTACT TGTTCGTTAT AGACATGGAC GTTAAGAGAT 3660 

ATTCAAAACT CAGAAGCATC AGCAATGTTT CTCTTTTCTT AGTTCATTCT GCAGAATGGA 3720 

AACCCATGCC TATTAGAAAT GACAGTACTT ATTAATTGAG TCCCTAAGGA ATATTCAGCC 3780 

CACTACATAG ATAGCTTTTT TTTTTTTTTT TTTTTTTTAA TAAGGACACC TCTTTCCAAA 3840 

CAGGCCATCA AATATGTTCT TATCTCAGAC TTACGTTGTT TTAAAAGTTT GGAAAGATAC 3900 

ACATCTTTTC ATACCCCCCC TTAGGAGGTT GGGCTTTCAT ATCACCTCAG CCAACTGTGG 3960 

CTCTTAATTT ATTGCATAAT GATATCCACA TCAGCCAACT GTGGCTCTTT AATTTATTGC 4020 

ATAATGATAT TCACATCCCC TCAGTTGCAG TGAATTGTGA GCAAAAGATC TTGAAAGCAA 4080 

AAAGCACTAA TTAGTTTAAA ATGTCACTTT TTTGGTTTTT ATTATACAAA AACCATGAAG 4140 

TACTTTTTTT ATTTGCTAAA TCAGATTGTT CCTTTTTAGT GACTCATGTT TATGAAGAGA 4200 

GTTGAGTTTA ACAATCCTAG CTTTTAAAAG AAACTATTTA ATGTAAAATA TTCTACATGT 4260 

CATTCAGATA TTATGTATAT CTTCTAGCCT TTATTCTGTA CTTTTAATGT ACATATTTCT 4320 

GTCTTGCGTG ATTTGTATAT TTCACTGGTT TAAAAAACAA ACATCGAAAG GCTTATTCCA 4380 
AATGGAAGAT AGAATATAAA ATAAAACGTT ACTTGTAAAA AAAAAAAA 

Seq ID NO: 249 Protein sequence: 
Protein Accession #: NP_003383 

1 11 21 31 41 51 

MAGSAMSSKF FLVALAIFFS FAQVVIEANS WWSLGMNNPV QMSEVYIIGA QPLCSQLAGL 60 
SQGQKKLCHL YQDHMQYIGE GAKTGIKECQ YQFRHRRWNC STVDNTSVFG RVMQIGSRET 120 
AFTYAVSAAG VVNAMSRACR EGELSTCGCS RAARPKDLPR DWLWGGCGDN IDYGYRFAKE 180 
FVDARERERI HAKGSYESAR ILMNLHNNEA GRRTVYNLAD VACKCHGVSG SCSLKTCWLQ 240 
LADFRKVGDA LKEKYDSAAA MRLNSRGKLV QVNSRFNSPT TQDLVYIDPS PDYCVRNEST 300 
GSLGTQGRLC NKTSEGMDGC ELMCCGRGYD QFKTVQTERC HCKFHWCCYV KCKKCTEIVD 360 
QFVCK 

Seq ID NO: 250 DNA sequence 
Nucleic Acid Accession #: NM_014058 
Coding sequence: 56..1324 



1 11 21 31 41 51 
| | | | | | 

TGACTTGGAT GTAGACCTCG ACCTTCACAG GACTCTTCAT TGCTGGTTGG CAATGATGTA 60 
TCGGCCAGAT GTGGTGAGGG CTAGGAAAAG AGTTTGTTGG GAACCCTGGG TTATCGGCCT 120 
CGTCATCTTC ATATCCCTGA TTGTCCTGGC AGTGTGCATT GGACTCACTG TTCATTATGT 180 
GAGATATAAT CAAAAGAAGA CCTACAATTA CTATAGCACA TTGTCATTTA CAACTGACAA 240 
ACTATATGCT GAGTTTGGCA GAGAGGCTTC TAACAATTTT ACAGAAATGA GCCAGAGACT 300 
TGAATCAATG GTGAAAAATG CATTTTATAA ATCTCCATTA AGGGAAGAAT TTGTCAAGTC 360 
TCAGGTTATC AAGTTCAGTC AACAGAAGCA TGGAGTGTTG GCTCATATGC TGTTGATTTG 420 
TAGATTTCAC TCTACTGAGG ATCCTGAAAC TGTAGATAAA ATTGTTCAAC TTGTTTTACA 480 
TGAAAAGCTG CAAGATGCTG TAGGACCCCC TAAAGTAGAT CCTCACTCAG TTAAAATTAA 540 
AAAAATCAAC AAGACAGAAA CAGACAGCTA TCTAAACCAT TGCTGCGGAA CACGAAGAAG 600 
TAAAACTCTA GGTCAGAGTC TCAGGATCGT TGGTGGGACA GAAGTAGAAG AGGGTGAATG 660 
GCCCTGGCAG GCTAGCCTGC AGTGGGATGG GAGTCATCGC TGTGGAGCAA CCTTAATTAA 720 
TGCCACATGG CTTGTGAGTG CTGCTCACTG TTTTACAACA TATAAGAACC CTGCCAGATG 780 
GACTGCTTCC TTTGGAGTAA CAATAAAACC TTCGAAAATG AAACGGGGTC TCCGGAGAAT 840 
AATTGTCCAT GAAAAATACA AACACCCATC ACATGACTAT GATATTTCTC TTGCAGAGCT 900 
TTCTAGCCCT GTTCCCTACA CAAATGCAGT ACATAGAGTT TGTCTCCCTG ATGCATCCTA 960 
TGAGTTTCAA CCAGGTGATG TGATGTTTGT GACAGGATTT GGAGCACTGA AAAATGATGG 1020 
TTACAGTCAA AATCATCTTC GACAAGCACA GGTGACTCTC ATAGACGCTA CAACTTGCAA 1080 
TGAACCTCAA GCTTACAATG ACGCCATAAC TCCTAGAATG TTATGTGCTG GCTCCTTAGA 1140 
AGGAAAAACA GATGCATGCC AGGGTGACTC TGGAGGACCA CTGGTTAGTT CAGATGCTAG 1200 
AGATATCTGG TACCTTGCTG GAATAGTGAG CTGGGGAGAT GAATGTGCGA AACCCAACAA 1260 
GCCTGGTGTT TATACTAGAG TTACGGCCTT GCGGGACTGG ATTACTTCAA AAACTGGTAT 1320 
CTAAGAGAGA AAAGCCTCAT GGAACAGATA ACATTTTTTT TTGTTTTTTG GGTGTGGAGG 1380 
CCATTTTTAG AGATACAGAA TTGGAGAAGA CTTGCAAAAC AGCTAGATTT GACTGATCTC 1440 
AATAAACTGT TTGCTTGATG CAAAAAAAAA A 



Seq ID NO: 251 Protein sequence: 
Protein Accession #: NP_054777 

1 11 21 31 41 51 
| | | | | | 

MYRPDVVRAR KRVCWEPWVI GLVIFISLIV LAVCIGLTVH YVRYNQKKTY NYYSTLSFTT 60 
DKLYAEFGRE ASNNFTEMSQ RLESMVKNAF YKSPLREEFV KSQVIKFSQQ KHGVLAHMLL 120 
ICRFHSTEDP ETVDKIVQLV LHEKLQDAVG PPKVDPHSVK IKKINKTETD SYLNHCCGTR 180 
RSKTLGQSLR IVGGTEVEEG EWPWQASLQW DGSHRCGATL INATWLVSAA HCFTTYKNPA 240 
RWTASFGVTI KPSKMKRGLR RIIVHEKYKH PSHDYDISLA ELSSPVPYTN AVHRVCLPDA 300 
SYEFQPGDVM FVTGFGALKN DGYSQNHLRQ AQVTLIDATT CNEPQAYNDA ITPRMLCAGS 360 
LEGKTDACQG DSGGPLVSSD ARDIWYLAGI VSWGDECAKP NKPGVYTRVT ALRDWITSKT 420 
GI 



Seq ID NO: 252 DNA sequence 

Nucleic Acid Accession #: NM_003504.2 

Coding sequence: 71-1771 

1 11 21 31 41 51 
| | | | | | 

GGCACGAGGC CTCGTGCCGC CGGGCTCTTG GTACCTCAGC GCGAGCGCCA GGCGTCCGGC 60 
CGCCGTGGCT ATGTTCGTGT CCGATTTCCG CAAAGAGTTC TACGAGGTGG TCCAGAGCCA 120 
GAGGGTCCTT CTCTTCGTGG CCTCGGACGT GGATGCTCTG TGTGCGTGCA AGATCCTTCA 180 
GGCCTTGTTC CAGTGTGACC ACGTGCAATA TACGCTGGTT CCAGTTTCTG GGTGGCAAGA 240 
ACTTGAAACT GCATTTCTTG AGCATAAAGA ACAGTTTCAT TATTTTATTC TCATAAACTG 300 
TGGAGCTAAT GTAGACCTAT TGGATATTCT TCAACCTGAT GAAGACACTA TATTCTTTGT 360 
GTGTGACACC CATAGGCCAG TCAATGTCGT CAATGTATAC AACGATACCC AGATCAAATT 420 
ACTCATTAAA CAAGATGATG ACCTTGAAGT TCCCGCCTAT GAAGACATCT TCAGGGATGA 480 
AGAGGAGGAT GAAGAGCATT CAGGAAATGA CAGTGATGGG TCAGAGCCTT CTGAGAAGCG 540 
CACACGGTTA GAAGAGGAGA TAGTGGAGCA AACCATGCGG AGGAGGCAGC GGCGAGAGTG 600 
GGAGGCCCGG AGAAGAGACA TCCTCTTTGA CTACGAGCAG TATGAATATC ATGGGACATC 660 
GTCAGCCATG GTGATGTTTG AGCTGGCTTG GATGCTGTCC AAGGACCTGA ATGACATGCT 720 
GTGGTGGGCC ATCGTTGGAC TAACAGACCA GTGGGTGCAA GACAAGATCA CTCAAATGAA 780 
ATACGTGACT GATGTTGGTG TCCTGCAGCG CCACGTTTCC CGCCACAACC ACCGGAACGA 840 
GGATGAGGAG AACACACTCT CCGTGGACTG CACACGGATC TCCTTTGAGT ATGACCTCCG 900 
CCTGGTGCTC TACCAGCACT GGTCCCTCCA TGACAGCCTG TGCAACACCA GCTATACCGC 960 
AGCCAGGTTC AAGCTGTGGT CTGTGCATGG ACAGAAGCGG CTCCAGGAGT TCCTTGCAGA 1020 
CATGGGTCTT CCCCTGAAGC AGGTGAAGCA GAAGTTCCAG GCCATGGACA TCTCCTTGAA 1080 
GGAGAATTTG CGGGAAATGA TTGAAGAGTC TGCAAATAAA TTTGGGATGA AGGACATGCG 1140 
CGTGCAGACT TTCAGCATTC ATTTTGGGTT CAAGCACAAG TTTCTGGCCA GCGACGTGGT 1200 
CTTTGCCACC ATGTCTTTGA TGGAGAGCCC CGAGAAGGAT GGCTCAGGGA CAGATCACTT 1260 
CATCCAGGCT CTGGACAGCC TCTCCAGGAG TAACCTGGAC AAGCTGTACC ATGGCCTGGA 1320 
ACTCGCCAAG AAGCAGCTGC GAGCCACCCA GCAGACCATT GCCAGCTGCC TTTGCACCAA 1380 
CCTCGTCATC TCCCAGGGGC CTTTCCTGTA CTGCTCTCTC ATGGAGGGCA CTCCAGATGT 1440 
CATGCTGTTC TCTAGGCCGG CATCCCTAAG CCTGCTCAGC AAACACCTGC TCAAGTCCTT 1500 
TGTGTGTTCG ACAAAGAACC GGCGCTGCAA ACTGCTGCCC CTGGTGATGG CTGCCCCCCT 1560 
GAGCATGGAG CATGGCACAG TGACCGTGGT GGGCATCCCC CCAGAGACCG ACAGCTCGGA 1620 
CAGGAAGAAC TTTTTTGGGA GGGCGTTTGA GAAGGCAGCG GAAAGCACCA GCTCCCGGAT 1680 
GCTGCACAAC CATTTTGACC TCTCAGTAAT TGAGCTGAAA GCTGAGGATC GGAGCAAGTT 1740 
TCTGGACGCA CTTATTTCCC TCCTGTCCTA GGAATTTGAT TCTTCCAGAA TGACCTTCTT 1800 
ATTTATGTAA CTGGCTTTCA TTTAGATTGT AAGTTATGGA CATGATTTGA GATGTAGAAG 1860 
CCATTTTTTA TTAAATAAAA TGCTTATTTT AGGCTCOGTC CCCAAAAAAA AAAAAAAAAA 1920 
AAAAAAAAAA AA 

Seq ID NO: 253 Protein sequence: 
Protein Accession #: NP_003495.1 

1 11 21 31 41 51 
| | | | | | 

MFVSDFRKEF YEVVQSQRVL LFVASDVDAL CACKILQALF QCDHVQYTLV PVSGWQELET 60 
AFLEHKEQFH YFILINCGAN VDLLDILQPD EDTIFFVCDT HRPVNVVNVY NDTQIKLLIK 120 
QDDDLEVPAY EDIFRDEEED EEHSGNDSDG SEPSEKRTRL EEEIVEQTMR RRQRREWEAR 180 
RRDILFDYEQ YEYHGTSSAM VMFELAWMLS KDLNDMLWWA IVGLTDQWVQ DKITQMKYVT 240 
DVGVLQRHVS RHNHRNEDEE NTLSVDCTRI SFEYDLRLVL YQHWSLHDSL CNTSYTAARF 300 
KLWSVHGQKR LQEFLADMGL PLKQVKQKFQ AMDISLKENL REMIEESANK FGMKDMRVQT 360 
FSIHFGFKHK FLASDVVFAT MSLMESPEKD GSGTDHFIQA LDSLSRSNLD KLYHGLELAK 420 
KQLRATQQTI ASCLCTNLVI SQGPFLYCSL MEGTPDVMLF SRPASLSLLS KHLLKSFVCS 480 
TKNRRCKLLP LVMAAPLSME HGTVTVVGIP PETDSSDRKN FFGRAFEKAA ESTSSRMLHN 540 
HFDLSVIELK AEDRSKFLDA LISLLS 



Seq ID NO: 254 DNA sequence 
Nucleic Acid Accession #j NM_022337 
Coding sequence: 48.. 683 



60 
120 
180 
240 
300 
360 
420 
480 
540 



30 
35 
40 
45 
50 
55 
60 
65 
70 
75. 
80 
85 



GGCTGCGCTT 
ACAAGGAGCA 
TCATCAAGCG 
ACTTCGCGCT 
ATATCGCAGG 
GTGCATTTAT 
AAAATGATTT 
TGGCCAACAA 
AGTTCTGCAA 
ACATTGATGA 
TGGAGTCTAT 
GCTCTGGCTG 
TTGTTCCACA 
CACATGTGGC 
GTTCTTTCTA 
TCTGTTACAA 
TTATTTGCTT 
AACTAGCTGT 
AATATATTCT 
GACCTCCATT 
ACAGGTGTGC 
AACTGAATAT 
CTCAAGCTGT 
GCAAGTGAAC 



11 
I 

CCCTGGTCAG 
CCTGTACAAG 
CTACGTGCAC 
CAAGGTGCTC 
TCAAGAAAGA 
TGTCTTCGAT 
GGACTCCAAG 
ATGTGACCAG 
GGAGCACGGT 
AGCCTCCAGA 
TGAGCCGGAC 
TGCCAAATCC 
AATTGTGCCT 
AAGCCAAAGA 
TGCTTTCCTC 
ACTTCTGTCA 
CTTTTAATCA 
CAAGTCAAGG 
CTGATGGCCT 
CTCGGCAGAC 
TATATTGTCC 
TGTATGAAAA 
GGGGCTCCTC 
AATAAAACAT 



21 
I 

GCACGGCACG 
TTGCTGGTGA 
CAGAACTTCT 
CACTGGGACC 
TTTGGAAACA 
GTCACCAGGC 
TTAAGTCTCC 
GGGAAGGATG 
TTCGTAGGAT 
TGCCTGGTGA 
GTCGTGAAGC 
TAGTAGGCAC 
CTATTTTTAC 
TCTATGCCTC 
ACCATCATCA 
TGTAGCTGAC 
GCAAAGGCCT 
ACTGGCTTTC 
GACAGGCCTA 
CTAAGAGTTG 
TTGTCCTAAC 
GACATGCCTC 
TATACATGCT 
TAAAAGATAA 



31 
I 

TCTGGCCGGC 
TTGGCGACCT 
CCTCGCACTA 
CGGAGACTGT 
TGACGAGGGT 
CAGCCACATT 
CTAATGGCAA 
TGCTCATGAA 
GGTTTGAAAC 
AACACATACT 
CCCATCTCAC 
CTTTGCTGGT
CATTTTGGGT 
TGTTTTTTCA 
CAGTGTTTAC 
CAAAATCCTG 
CAAGTCTTAA 
ACCTTGCCCT 
TTAAGTAGAT 
CCTCTGAGTT 
TGTCACTTGC 
CATATGTGCC 
ATACATGTAA 
AA 



41 
I 

CGCCAGGATG 
GGGCGTGGGG 
CCGGGCCACA 
GGTGCGCCTG 
CTATTACCGA 
TGAAGCAGTG 
ACCGGTTTCA 
CAATGGCCTC 
ATCAGCAAAG 
TGCAAATGAG 
ATCAACCAAG 
GTCTGGTAGG 
AAACGTCAGG 
ATGAGAGAGA 
AAACTTTTGA 
CAGGGCCACA 
AATAAAAGGG 
GGTGTCTTTT 
GTGATATTTT 
AGCTCTTTGG 
CATGGCCTGA 
TTTCTGTTAG 
TATATATTAT 



51 
I 

CAGGCCCCGC 
AAGACCAGTA 
ATCGGCGTGG 
CAGCTCTGGG 
GAAGCTATGG 
GCAAAGTGGA 
GTGGTTTTGT 
AAGATGGACC 
GAAAATATAA 
TGTGACCTAA 
GTTGCCAGCT 
AATGACCTCA 
ATAGATATAC 
AATAGCAAAT 
AAATATTTAG 
GTCGGCACTG 
GAGAAGAACA 
TCCAGATTTC 
CTTCCAAGAT 
AATCGTGAAC 
ATGTTGGCTT 
CTCTCTTTGA 
ATATATTTTT 



Seq ID NO: 255 Protein sequence: 
Protein Accession #: NP_071732



11 



21 



31 



41 



MQAPHKEHLY KLLVIGDLGV GKTSIIKRYV HQNFSSHYRA TIGVDFALKV LHWDPETVVR
LQLWDIAGQE RFGNMTRVYY REAMGAFIVF DVTRPATFEA VAKWKNDLDS KLSLPNGKPV
SVVLLANKCD QGKDVLMNNG LKMDQFCKEH GFVGWFETSA KENINIDEAS RCLVKHILAN
ECDLMESIEP DVVKPHLTST KVASCSGCAK S

Seq ID NO: 256 DNA sequence 
Nucleic Acid Accession #: NM_016321
Coding sequence: 25..1464



1 
I 

GGAACCGCCC 
CCGCTGGCGG 
GGTGTTCGTG 
GAACTTGAGC 
CGTGATGGTC 
CGCCGTGGGC 
GGGCTGGTTC 
CGCTGACTTC 
CCCCATTCAG 
CATTCTCCTT 
TGGCGCCTAC 
CAAGGAGAGA 
CCTGTGGATG 
CCGAGCCGCC 



11 
I 

GCTGCCAGCC 
CTGCCGCTCA 
CGCTACGACT 
GACATGGAGA 
TTCGTGGGCT 
TTCAACTTCC 
CACTTCTTAC 
TGCGTGGCCT 
CTGCTCATCA 
AACCTGCTAA 
TTTGGGCTCA 
CAGAATTCTG 
TACTGGCCCA 
ATCAACACCT 



21 
I 

CGGCCAGGCA 
CCTGCCTGCT 
TCGAGGCCGA 
ACGAATTCTA 
TCGGCTTCCT 
TGTTGGCAGC 
AAGACCGCTA 
CTGTCTGCGT 
TGACTTTCTT 
AGGTGAAGGA 
CAGTGACCCG 
TGTACCAGTC 
GCTTCAACTC 
ACTGCTCCTT 



31 

I 

CCCCTGCAGC 
CCTGCAGGTG 
CGCCCACTGG 
CTATCGCTAC 
CATGACTTTC 
CTTCGGCATC 
CATCGTCGTG 
GGCCTTTGGG 
CCAAGTGACC 
TGCAGGAGGC 
GATCCTCTAC 
GGACCTCTTT 
AGCCATATCC 
GGCAGCCTGC 



41 
I 

ATGGCCTGGA 
ATTATGGTGA 
TGGTCAGAGA 
CCAAGCTTCC 
CTGCAGCGCT 
CAGTGGGCGC 
GGCGTGGAGA 
GCAGTTCTGG 
CTCTTCGCTG 
TCCATGACCA 
CGACGCAACC 
GCCATGATTG 
TACCATGGGG 
GTGCTTACCT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 



60 
120 
180 



51 
1 

ACACCAACCT 
TTCTCTTCGG 
GGACGCACAA 
AGGACGTGCA 
ACGGCTTCAG 
TGCTCATGCA 
ACCTCATCAA 
GTAAAGTCAG 
TGAATGAGTT 
TCCACACATT 
TAGAGCAGAG 
GCACCCTCTT 
ACAGCCAGCA 
CGGTGGCAAT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 




ATCCAGTGCC CTGCACAAGA AGGGCAAGCT GGACATGGTG CACATCCAGA ATGCCACGCT 900 

CGCAGGAGGG GTGGCCGTGG GTACCGCTGC TGAGATGATG CTCATGCCTT ACGGTGCCCT 960 

CATCATCGGC TTCGTCTGCG GCATCATCTC CACCCTGGGT TTTGTATACC TGACCCCATT 1020 

CCTGGAGTCC CGGCTGCACA TCCAGGACAC ATGTGGCATT AACAATCTGC ATGGCATTCC 1080 

TGGCATCATA GGCGGCATCG TGGGTGCTGT GACAGCGGCC TCCGCCAGCC TTGAAGTCTA 1140 

TGGAAAAGAA GGGCTTGTCC ATTCCTTTGA CTTTCAAGGT TTCAACGGGG ACTGGACCGC 1200 

AAGAACACAG GGAAAGTTCC AGATTTATGG TCTCTTGGTG ACCCTGGCCA TGGCCCTGAT 1260 

GGGTGGCATC ATTGTGGGGC TCATTTTGAG ATTACCATTC TGGGGACAAC CTTCAGATGA 1320 

GAACTGCTTT GAGGATGCGG TCTACTGGGA GATGCCTGAA GGGAACAGCA CTGTCTACAT 1380 

CCCTGAGGAC CCCACCTTCA AGCCCTCAGG ACCCTCAGTA CCCTCAGTAC CCATGGTGTC 1440 

CCCACTACCC ATGGCTTCCT CGGTACCCTT GGTACCCTAG GCTCCCAGGG CAGGTGAGGA 1500 

GCAGGCTCCA CAGACTSTCC TGGGGCCCAG AGGAGCTGGT GCTGACCTAG CTAGGGATGC 1560 

AAGAGTGAGC AAGCAGCACC CCCACCTGCT GGCTTGGCCT CAAGGTGCCT CCACCCCTGC 1620 

CCTCCCCTTC ATCCCAGGGG GTCTGMCTGA GAATGGAGAA GGAGAAGCTA CAAAGTGGGC 1680 

ATCCAAGCCG GGTTCTGGCT GCAGAAGTTC TGCCTCTGCC TGGGGTCTTG GCCACATTGG 1740 

AGAAAAACAG GCTCAAAGTG GGGCTGGGAC CTGGTGGGTG AACCTGAGCT CTCCCAGGAG 1800 

ACAACTTAGC TGCCAGTCAC CACCTATGAG GCTCTTCTAC CCCGTGCCTG CACCTCGGCC 1860 

AGCATCTCCT ATGCTCCCTG GGTCCCCCAG ACCTCTCTGT GTTGTGTGCG TGGCAGCCTC 1920 
CAGGAATAAA CATTCTTGTT GTCCTTTGTA AAAAAAAAAA AAAAAAAA 



Seq ID NO: 257 Protein sequence: 
Protein Accession #: NP_057405 

1 11 21 31 41 51

MAWNTNLRWR LPLTCLLLQV IMVILFGVFV RYDFEADAHW WSERTHKNLS DMENEFYYRY 60 

PSFQDVHVMV FVGFGFLMTF LQRYGFSAVG FNFLLAAFGI QWALLMQGWF HFLQDRYIVV 120

GVENLINADF CVASVCVAFG AVLGKVSPIQ LLIMTFFQVT LFAVNEFILL NLLKVKDAGG 180

SMTIHTFGAY FGLTVTRILY RRNLEQSKER QNSVYQSDLF AMIGTLFLWM YWPSFNSAIS 240 

YHGDSQHRAA INTYCSLAAC VLTSVAISSA LHKKGKLDMV HIQNATLAGG VAVGTAAEMM 300 

LMPYGALIIG FVCGIISTLG FVYLTPFLES RLHIQDTCGI NNLHGIPGII GGIVGAVTAA 360

SASLEVYGKE GLVHSFDFQG FNGDWTARTQ GKFQIYGLLV TLAMALMGGI IVGLILRLPF 420 
WGQPSDENCF EDAVYWEMPE GNSTVYIPED PTFKPSGPSV PSVPMVSPLP MASSVPLVP 

Seq ID NO: 258 DNA sequence 

Nucleic Acid Accession #: NM_002358.2

Coding sequence: 75..692



1 11 21 31 41 51 

I I I I I I 

GGGAAGTGCT GTTGGAGCCG CTGTGGTTGC TGTCCGCGGA GTGGAAGCGC GTGCTTTTGT 60 

TTGTGTCCCT GGCCATGGCG CTGCAGCTCT CCCGGGAGCA GGGAATCACC CTGCGCGGGA 120 

GCGCCGAAAT CGTGGCCGAG TTCTTCTCAT TCGGCATCAA CAGCATTTTA TATCAGCGTG 180 

GCATATATCC ATCTGAAACC TTTACTCGAG TGCAGAAATA CGGACTCACC TTGCTTGTAA 240 

CTACTGATCT TGAGCTCATA AAATACCTAA ATAATGTGGT GGAACAACTG AAAGATTGGT 300 

TATACAAGTG TTCAGTTCAG AAACTGGTTG TAGTTATCTC AAATATTGAA AGTGGTGAGG 360 

TCCTGGAAAG ATGGCAGTTT GATATTGAGT GTGACAAGAC TGCAAAAGAT GACAGTGCAC 420 

CCAGAGAAAA GTCTCAGAAA GCTATCCAGG ATGAAATCCG TTCAGTGATC AGACAGATCA 480 

CAGCTACGGT GACATTTCTG CCACTGTTGG AAGTTTCTTG TTCATTTGAT CTGCTGATTT 540 

ATACAGACAA AGATTTGGTT GTACCTGAAA AATGGGAAGA GTCGGGACCA CAGTTTATTA 600 

CCAATTCTGA GGAAGTCCGC CTTCGTTCAT TTACTACTAC AATCCACAAA GTAAATAGCA 660 

TGGTGGCCTA CAAAATTCCT GTCAATGACT GAGGATGACA TGAGGAAAAT AATGTAATTG 720 

TAATTTTGAA ATGTGGTTTT CCTGAAATCA GGTCATCTAT AGTTGATATG TTTTATTTCA 780

TTGGTTAATT TTTACATGGA GAAAACCAAA ATGATACTTA CTGAACTGTG TGTAATTGTT 840 

CCTTTATTTT TTTGGTACCT ATTTGACTTA CCATGGAGTT AACATCATGA ATTTATTGCA 900

CATTGTTCAA AAGGAACCAG GAGGTTTTTT TGTCAACATT GTGATGTATA TTCCTTTGAA 960 

GATAGTAACT GTAGATGGAA AAACTTGTGC TATAAAGCTA GATGCTTTCC TAAATCAGAT 1020 

GTTTTGGTCA AGTAGTTTGA CTCAGTATAG GTAGGGAGAT ATTTAAGTAT AAAATACAAC 1080 

AAAGGAAGTC TAAATATTCA GAATCTTTGT TAAGGTCCTG AAAGTAACTC ATAATCTATA 1140 

AACAATGAAA TATTGCTGTA TAGCTCCTTT TGACCTTCAT TTCATGTATA GTTTTCCCTA 1200 

TTGAATCAGT TTCCAATTAT TTGACTTTAA TTTATGTAAC TTGAACCTAT GAAGCAATGG 1260 

ATATTTGTAC TGTTTAATGT TCTGTGATAC AGAACTCTTA AAAATGTTTT TTCATGTGTT 1320 

TTATAAAATC AAGTTTTAAG TGAAAGTGAG GAAATAAAGT TAAGTTTGTT TTAAAAAAAA 1380 
AAAAAAAAAA 



Seq ID NO: 259 Protein sequence: 
Protein Accession #: NP_002349.1 

1 11 21 31 41 51

I I I I I I

MALQLSREQG ITLRGSAEIV AEFFSFGINS ILYQRGIYPS ETFTRVQKYG LTLLVTTDLE 60 
LIKYLNNVVE QLKDWLYKCS VQKLVVVISN IESGEVLERW QFDIECDKTA KDDSAPREKS 120
QKAIQDEIRS VIRQITATVT FLPLLEVSCS FDLLIYTDKD LVVPEKWEES GPQFITNSEE 180
VRLRSFTTTI HKVNSMVAYK IPVND 



Seq ID NO: 260 DNA sequence 
Nucleic Acid Accession #: NM_001211 
Coding sequence: 43.. 3195 

1 11 21 31 41 51 

I I I I I I 

AAAGGCCTGC AGCAGGACGA GGACCTGAGC CAGGAATGCA GGATGGCGGC GGTGAAGAAG 60 
GAAGGGGGTG CTCTGAGTGA AGCCATGTCC CTGGAGGGAG ATGAATGGGA ACTGAGTAAA 120 
GAAAATGTAC AACCTTTAAG GCAAGGGCGG ATCATGTCCA CGCTTCAGGG AGCACTGGCA 180 
CAAGAATCTG CCTGTAACAA TACTCTTCAG CAGCAGAAAC GGGCATTTGA ATATGAAATT 240 




CGATTTTACA CTGGAAATGA CCCTCTGGAT GTTTGGGATA GGTATATCAG CTGGACAGAG 300 
CAGAACTATC CTCAAGGTGG GAAAGAGAGT AATATGTCAA CGTTATTAGA AAGAGCTGTA 360 
GAAGCACTAC AAGGAGAAAA ACGATATTAT AGTGATCCTC GATTTCTCAA TCTCTGGCTT 420 
AAATTAGGGC GTTTATGCAA TGAGCCTTTG GATATGTACA GTTACTTGCA CAACCAAGGG 480 
ATTGGTGTTT CACTTGCTCA GTTCTATATC TCATGGGCAG AAGAATATGA AGCTAGAGAA 540 
AACTTTAGGA AAGCAGATGC GATATTTCAG GAAGGGATTC AACAGAAGGC TGAACCACTA 600 
GAAAGACTAC AGTCCCAGCA CCGACAATTC CAAGCTCGAG TGTCTCGGCA AACTCTGTTG 660 
GCACTTGAGA AAGAAGAAGA GGAGGAAGTT TTTGAGTCTT CTGTACCACA ACGAAGCACA 720 
CTAGCTGAAC TAAAGAGCAA AGGGAAAAAG ACAGCAAGAG CTCCAATCAT CCGTGTAGGA 780 
GGTGCTCTCA AGGCTCCAAG CCAGAACAGA GGACTCCAAA ATCCATTTCC TCAACAGATG 840 
CAAAATAATA GTAGAATTAC TGTTTTTGAT GAAAATGCTG ATGAGGCTTC TACAGCAGAG 900 
TTGTCTAAGC CTACAGTCCA GCCATGGATA GCACCCCCCA TGCCCAGGGC CAAAGAGAAT 960 

GAGCTGCAAG CAGGCCCTTG GAACACAGGC AGGTCCTTGG AACACAGGCC TCGTGGCAAT 1020 

ACAGCTTCAC TGATAGCTGT ACCCGCTGTG CTTCCCAGTT TCACTCCATA TGTGGAAGAG 1080 

ACTGCACAAC AGCCAGTTAT GACACCATGT AAAATTGAAC CTAGTATAAA CCACATCCTA 1140 

AGCACCAGAA AGCCTGGAAA GGAAGAAGGA GATCCTCTAC AAAGGGTTCA GAGCCATCAG 1200 

CAAGCGTCTG AGGAGAAGAA AGAGAAGATG ATGTATTGTA AGGAGAAGAT TTATGCAGGA 1260 

GTAGGGGAAT TCTCCTTTGA AGAAATTCGG GCTGAAGTTT TCCGGAAGAA ATTAAAAGAG 1320 

CAAAGGGAAG CCGAGCTATT GACCAGTGCA GAGAAGAGAG CAGAAATGCA GAAACAGATT 1380 

GAAGAGATGG AGAAGAAGCT AAAAGAAATC CAAACTACTC AGCAAGAAAG AACAGGTGAT 1440 

CAGCAAGAAG AGACGATGCC TACAAAGGAG ACAACTAAAC TGCAAATTGC TTCCGAGTCT 1500 

CAGAAAATAC CAGGAATGAC TCTATCCAGT TCTGTTTGTC AAGTAAACTG TTGTGCCAGA 1560 

GAAACTTCAC TTGCGGAGAA CATTTGGCAG GAACAACCTC ATTCTAAAGG TCCCAGTGTA 1620 

CCTTTCTCCA TTTTTGATGA GTTTCTTCTT TCAGAAAAGA AGAATAAAAG TCCTCCTGCA 1680 

GATCCCCCAC GAGTTTTAGC TCAACGAAGA CCCCTTGCAG TTCTCAAAAC CTCAGAAAGC 1740 

ATCACCTCAA ATGAAGATGT GTCTCCAGAT GTTTGTGATG AATTTACAGG AATTGAACCC 1800 

TTGAGCGAGG ATGCCATTAT CACAGGCTTC AGAAATGTAA CAATTTGTCC TAACCCAGAA 1860 

GACACTTGTG ACTTTGCCAG AGCAGCTCGT TTTGTATCCA CTCCTTTTCA TGAGATAATG 1920 

TCCTTGAAGG ATCTCCCTTC TGATCCTGAG AGACTGTTAC CGGAAGAAGA TCTAGATGTA 1980 

AAGACCTCTG AGGACCAGCA GACAGCTTGT GGCACTATCT ACAGTCAGAC TCTCAGCATC 2040 

AAGAAGCTGA GCCCAATTAT TGAAGACAGT CGTGAAGCCA CACACTCCTC TGGCTTCTCT 2100 

GGTTCTTCTG CCTCGGTTGC AAGCACCTCC TCCATCAAAT GTCTTCAAAT TCCTGAGAAA 2160 

CTAGAACTTA CTAATGAGAC TTCAGAAAAC CCTACTCAGT CACCATGGTG TTCACAGTAT 2220 

CGCAGACAGC TACTGAAGTC CCTACCAGAG TTAAGTGCCT CTGCAGAGTT GTGTATAGAA 2280 

GACAGACCAA TGCCTAAGTT GGAAATTGAG AAGGAAATTG AATTAGGTAA TGAGGATTAC 2340 

TGCATTAAAC GAGAATACCT AATATGTGAA GATTACAAGT TATTCTGGGT GGCGCCAAGA 2400 

AACTCTGCAG AATTAACAGT AATAAAGGTA TCTTCTCAAC CTGTCCCATG GGACTTTTAT 2460 

ATCAACCTCA AGTTAAAGGA ACGTTTAAAT GAAGATTTTG ATCATTTTTG CAGCTGTTAT 2520 

CAATATCAAG ATGGCTGTAT TGTTTGGCAC CAATATATAA ACTGCTTCAC CCTTCAGGAT 2580 

CTTCTCCAAC ACAGTGAATA TATTACCCAT GAAATAACAG TGTTGATTAT TTATAACCTT 2640 

TTGACAATAG TGGAGATGCT ACACAAAGCA GAAATAGTCC ATGGTGACTT GAGTCCAAGG 2700 

TGTCTGATTC TCAGAAACAG AATCCACGAT CCCTATGATT GTAACAAGAA CAATCAAGCT 2760 

TTGAAGATAG TGGACTTTTC CTACAGTGTT GACCTTAGGG TGCAGCTGGA TGTTTTTACC 2820 

CTCAGCGGCT TTCGGACTGT ACAGATCCTG GAAGGACAAA AGATCCTGGC TAACTGTTCT 2880 

TCTCCCTACC AGGTAGACCT GTTTGGTATA GCAGATTTAG CACATTTACT ATTGTTCAAG 2940 

GAACACCTAC AGGTCTTCTG GGATGGGTCC TTCTGGAAAC TTAGCCAAAA TATTTCTGAG 3000 

CTAAAAGATG GTGAATTGTG GAATAAATTC TTTGTGCGGA TTCTGAATGC CAATGATGAG 3060 

GCCACAGTGT CTGTTCTTGG GGAGCTTGCA GCAGAAATGA ATGGGGTTTT TGACACTACA 3120 

TTCCAAAGTC ACCTGAACAA AGCCTTATGG AAGGTAGGGA AGTTAACTAG TCCTGGGGCT 3180 

TTGCTCTTTC AGTGAGCTAG GCAATCAAGT CTCACAGATT GCTGCCTCAG AGCAATGGTT 3240 

GTATTGTGGA ACACTGAAAC TGTATGTGCT GTAATTTAAT TTAGGACACA TTTAGATGCA 3300 

CTACCATTGC TGTTCTACTT TTTGGTACAG GTATATTTTG ACGTCACTGA TATTTTTTAT 3360 

ACAGTGATAT ACTTACTCAT GGCCTTGTCT AACTTTTGTG AAGAACTATT TTATTCTAAA 3420 

CAGACTCATT ACAAATGGTT ACCTTGTTAT TTAACCCATT TGTCTCTACT TTTCCCTGTA 3480

CTTTTCCCAT TTGTAATTTG TAAAATGTTC TCTTATGATC ACCATGTATT TTGTAAATAA 3540 
TAAAATAGTA TCTGTTAAAA AAAAAAAAAA AAAAAAAAAA AAA 

Seq ID NO: 261 Protein sequence:
Protein Accession #: NP_001202

1 11 21 31 41 51 

I I I I I I

MAAVKKEGGA LSEAMSLEGD EWELSKENVQ PLRQGRIMST LQGALAQESA CNNTLQQQKR 60
AFEYEIRFYT GNDPLDVWDR YISWTEQNYP QGGKESNMST LLERAVEALQ GEKRYYSDPR 120 
FLNLWLKLGR LCNEPLDMYS YLHNQGIGVS LAQFYISWAE EYEARENFRK ADAIFQEGIQ 180 
QKAEPLERLQ SQHRQFQARV SRQTLLALEK EEEEEVFESS VPQRSTLAEL KSKGKKTARA 240 
PIIRVGGALK APSQNRGLQN PFPQQMQNNS RITVFDENAD EASTAELSKP TVQPWIAPPM 300 
PRAKENELQA GPWNTGRSLE HRPRGNTASL IAVPAVLPSF TPYVEETAQQ PVMTPCKIEP 360 
SINHILSTRK PGKEEGDPLQ RVQSHQQASE EKKEKMMYCK EKIYAGVGEF SFEEIRAEVF 420 
RKKLKEQREA ELLTSAEKRA EMQKQIEEME KKLKEIQTTQ QERTGDQQEE TMPTKETTKL 480 
QIASESQKIP GMTLSSSVCQ VNCCARETSL AENIWQEQPH SKGPSVPFSI FDEFLLSEKK 540 
NKSPPADPPR VLAQRRPLAV LKTSESITSN EDVSPDVCDE FTGIEPLSED AIITGFRNVT 600 
ICPNPEDTCD FARAARFVST PFHEIMSLKD LPSDPERLLP EEDLDVKTSE DQQTACGTIY 660 
SQTLSIKKLS PIIEDSREAT HSSGFSGSSA SVASTSSIKC LQIPEKLELT NETSENPTQS 720
PWCSQYRRQL LKSLPELSAS AELCIEDRPM PKLEIEKEIE LGNEDYCIKR EYLICEDYKL 780
FWVAPRNSAE LTVIKVSSQP VPWDFYINLK LKERLNEDFD HFCSCYQYQD GCIVWHQYIN 840
CFTLQDLLQH SEYITHEITV LIIYNLLTIV EMLHKAEIVH GDLSPRCLIL RNRIHDPYDC 900
NKNNQALKIV DFSYSVDLRV QLDVFTLSGF RTVQILEGQK ILANCSSPYQ VDLFGIADLA 960

HLLLFKEHLQ VFWDGSFWKL SQNISELKDG ELWNKFFVRI LNANDEATVS VLGELAAEMN 1020 
GVFDTTFQSH LNKALWKVGK LTSPGALLFQ 

Seq ID NO: 262 DNA sequence
Nucleic Acid Accession #: NM_003784
Coding sequence: 365..1507



1 11 21 31 41 51 



I I I I I I 

GTCTACTTAT CAATAAGCAG CTQCCTGTGC AGAGTGCAGG CTGCACCTTT GGACAGCCTT 60 

TAAAACTGAA TTCTCAGAAT TTTAGAACAA ATTTTTGTCT AGAAATGCTG ACTTTGGTTC 120 

ATTAGGTAGT GGTAAAACAG GCTCCCTTCG AAGCTCTCCT TCATCACCTT CCTAAGTGCA 180

TGTACAGGGA AGCTCTCCTT CATCACCTTC CTAAGTGCAT GGGGGAAAAT ACCTAGGGCT 240 

CAACAGTCTT GAGAAGTGTG GAAACATTTT CTTTGTGAGT GAGAACAGAT CACCTAGAGA 300 

AAGGAAACCA GATTCCCATC ACTGCTTCTG GGTATCAGAT GCTAGCGCTG CACTCCATTT 360

TGCAATGGCC TCCCTTGCTG CAGCAAATGC AGAGTTTTGC TTCAACCTGT TCAGAGAGAT 420 

GGATGACAAT CAAGGAAATG GAAATGTGTT CTTTTCCTCT CTGAGCCTCT TCGCTGCCCT 480 

GGCCCTGGTC CGCTTGGGCG CTCAAGATGA CTCCCTCTCT CAGATTGATA AGTTGCTTCA 540

TGTTAACACT GCCTCAGGAT ATGGAAACTC TTCTAATAGT CAGTCAGGGC TCCAGTCTCA 600 

ACTGAAAAGA GTTTTTTCTG ATATAAATGC ATCCCACAAG GATTATGATC TCAGCATTGT 660 

GAATGGGCTT TTTGCTGAAA AAGTGTATGG CTTTCATAAG GACTACATTG AGTGTGCCGA 720 

AAAATTATAC GATGCCAAAG TGGAGCGAGT TGACTTTACG AATCATTTAG AAGACACTAG 780 

ACGTAATATT AATAAGTGGG TTGAAAATGA AACACATGGC AAAATCAAGA ACGTGATTGG 840 

TGAAGGTGGC ATAAGCTCAT CTGCTGTAAT GGTGCTGGTG AATGCTGTGT ACTTCAAAGG 900 

CAAGTGGCAA TCAGCCTTCA CCAAGAGCGA AACCATAAAT TGCCATTTCA AATCTCCCAA 960 

GTGCTCTGGG AAGGCAGTCG CCATGATGCA TCAGGAACGG AAGTTCAATT TGTCTGTTAT 1020 

TGAGGACCCA TCAATGAAGA TTCTTGAGCT CAGATACAAT GGTGGCATAA ACATGTACGT 1080 

TCTGCTGCCT GAGAATGACC TCTCTGAAAT TGAAAACAAA CTGACCTTTC AGAATCTAAT 1140 

GGAATGGACC AATCCAAGGC GAATGACCTC TAAGTATGTT GAGGTATTTT TTCCTCAGTT 1200 

CAAGATAGAG AAGAATTATG AAATGAAACA ATATTTGAGA GCCCTAGGGC TGAAAGATAT 1260 

CTTTGATGAA TCCAAAGCAG ATCTCTCTGG GATTGCTTCG GGGGGTCGTC TGTATATATC 1320 

AAGGATGATG CACAAATCTT ACATAGAGGT CACTGAGGAG GGCACCGAGG CTACTGCTGC 1380 

CACAGGAAGT AATATTGTAG AAAAGCAACT CCCTCAGTCC ACGCTGTTTA GAGCTGACCA 1440 

CCCATTCCTA TTTGTTATCA GGAAGGATGA CATCATCTTA TTCAGTGGCA AAGTTTCTTG 1500 

CCCTTGAAAA TCCAATTGGT TTCTGTTATA GCAGTCCCCA CAACATCAAA GRACCACCAC 1560 

AAGTCAATAG ATYTGRGTTT AATTGGAAAA ATGTGGTGTT TCCTTTGAGT TTATTTCTTC 1620 

CTAACATTGG TCAGCAGATG ACACTGGTGA CTTGACCCTT CCTAGACACC TGGTTGATTG 1680 

TCCTGATCCC TGCTCTTAGC ATTCTACCAC CATGTGTCTC ACCCATTTCT AATTTCATTG 1740 

TCTTTCTTCC CACGCTCATT TCTATCATTC TCCCCCATGA CCCGTCTGGA AATTATGGAG 1800 

RGTGCTCAAC TGGTAAGGAG AACGTAGAAG TAGCCCTAGG GATCCTTTTT GAAACTCTAC 1860 

AGTTATCGCA GATATTCTAG CTTCATTGTA AGCAATCTAG GAAATAAGCC CTGCTGCTTT 1920 

CTAGAAATAA GTGTGAAGGA TAAATTTTCT TTGTTGACCT ATGAAGATTT TAGAGTTTAC 1980 

CTTCATATGT TTGATTTTAA ATCAGTGTAT AATCTAGATG GTAAAAAATG TGAAATTGGG 2040 

ATTAGGGACC TACCAAAATA TTTCATTAAT GCTTTCAATT GACAAATTTT GGCCTTTCTT 2100 

TGATAAGACA ATATGTACAT GTTTTTTCAA ATATTAAAGA TCTTTTAACT GTTGGCAGTT 2160 

GTTATCTACA GAATCATATT TCATATGCTG TGTAGTTTAT AAGTTTTTCC TCTATTTATC 2220 
AGAATAAAGA AATACAACAT ACCTGTAAA 

Seq ID NO: 263 Protein sequence: 
Protein Accession #: NP_003775



1 11 21 31 

I I I I 

MASLAAANAE FCFNLFREMD DNQGNGNVFF SSLSLFAALA 
NTASGYGNSS NSQSGLQSQL KRVFSDINAS HKDYDLSIVN 
LYDAKVERVD FTNHLEDTRR NINKWVENET HGKIKNVIGE 
WQSAFTKSET INCHFKSPKC SGKAVAMMHQ ERKFNLSVIE 
LPENDLSEIE NKLTFQNLME WTNPRRMTSK YVEVFFPQFK 
DESKADLSGI ASGGRLYISR MMHKSYIEVT EEGTEATAAT 
FLFVIRKDDI ILFSGKVSCP 

Seq ID NO: 264 DNA sequence 
Nucleic Acid Accession #: AB052906
Coding sequence: 74-814 

1 11 21 31 41 51 

I I I I I I 

AAAACCTTGA GGTGATTCAT CTTCCAGGCT CTCCTTCCAT CAAGTCTCTC CTCCCTAGCG 60

CTCTGGGTCC TTAATGGCAG CAGCCGCCGC TACCAAGATC CTTCTGTGCC TCCCGCTTCT 120 

GCTCCTGCTG TCCGGCTGGT CCCGGGCTGG GCGAGCCGAC CCTCACTCTC TTTGCTATGA 180 

CATCACCGTC ATCCCTAAGT TCAGACCTGG ACCACGGTGG TGTGCGGTTC AAGGCCAGGT 240 

GGATGAAAAG ACTTTTCTTC ACTATGACTG TGGCAACAAG ACAGTCACAC CTGTCAGTCC 300 

CCTGGGGAAG AAACTAAATG TCACAACGGC CTGGAAAGCA CAGAACCCAG TACTGAGAGA 360 

GGTGGTGGAC ATACTTACAG AGCAACTGCG TGACATTCAG CTGGAGAATT ACACACCCAA 420 

GGAACCCCTC ACCCTGCAGG CCAGGATGTC TTGTGAGCAG AAAGCTGAAG GACACAGCAG 480 

TGGATCTTGG CAGTTCAGTT TCGATGGGCA GATCTTCCTC CTCTTTGACT CAGAGAAGAG 540 

AATGTGGACA ACGGTTCATC CTGGAGCCAG AAAGATGAAA GAAAAGTGGG AGAATGACAA 600 

GGTTGTGGCC ATGTCCTTCC ATTACTTCTC AATGGGAGAC TGTATAGGAT GGCTTGAGGA 660 

CTTCTTGATG GGCATGGACA GCACCCTGGA GCCAAGTGCA GGAGCACCAC TCGCCATGTC 720 

CTCAGGCACA ACCCAACTCA GGGCCACAGC CACCACCCTC ATCCTTTGCT GCCTCCTCAT 780 

CATCCTCCCC TGCTTCATCC TCCCTGGCAT CTGAGGAGAG TCCTTTAGAG TGACAGGTTA 840 

AAGCTGATAC CAAAAGGCTC CTGTGAGCAC GGTCTTGATC AAACTCGCCC TTCTGTCTGG 900 

CCAGCTGCCC ACGACCTACG GTGTATGTCC AGTGGCCTCC AGCAGATCAT GATGACATCA 960 

TGGACCCAAT AGCTCATTCA CTGCCTTGAT TCCTTTTGCC AACAATTTTA CCAGCAGTTA 1020 

TACCTAACAT ATTATGCAAT TTTCTCTTGG TGCTACCTGA TGGAATTCCT GCACTTAAAG 1080 

TTCTGGCTGA CTAAACAAGA TATATCATTT TCTTTCTTCT CTTTTTGTTT GGAAAATCAA 1140 

GTACTTCTTT GAATGATGAT CTCTTTCTTG CAAATGATAT TGTCAGTAAA ATAATCACGT 1200 

TAGACTTCAG ACCTCTGGGG ATTCTTTCCG TGTCCTGAAA GAGAATTTTT AAATTATTTA 1260 

ATAAGAAAAA ATTTATATTA ATGATTGTTT CCTTTAGTAA TTTATTGTTC TGTACTGATA 1320 
TTTAAATAAA GAGTTCTATT TCCCAAAAAA AAAAAAAAAA A 

Seq ID NO: 265 Protein sequence:
Protein Accession #: BAB61048.1 



41 51 
I I 

LVRLGAQDDS LSQIDKLLHV 60

GLFAEKVYGF HKDYIECAEK 120 

GGISSSAVMV LVNAVYFKGK 180 

DPSMKILELR YNGGINMYVL 240 

IEKNYEMKQY LRALGLKDIF 300 

GSNIVEKQLP QSTLFRADHP 360




1 11 21 31 41 51 

I I I I I I 

MAAAAATKIL LCLPLLLLLS GWSRAGRADP HSLCYDITVI PKFRPGPRWC AVQGQVDEKT
FLHYDCGNKT VTPVSPLGKK LNVTTAWKAQ NPVLREVVDI LTEQLRDIQL ENYTPKEPLT
LQARMSCEQK AEGHSSGSWQ FSFDGQIFLL FDSEKRMWTT VHPGARKMKE KWENDKVVAM
SFHYFSMGDC IGWLEDFLMG MDSTLEPSAG APLAMSSGTT QLRATATTLI LCCLLIILPC
FILPGI 

Seq ID NO: 266 DNA sequence

Nucleic Acid Accession #: XM_084853.1

Coding sequence: 127-444 

1 11 21 31 41 51 

I I I I I I 

ATTGATGATA TATTTAACGA AATCAAATTT GGTGAATATG TGGACACTGG AAAGCTAATC
GACAAGATCA ACTTACCAGA TTTCCTAAAA GTGTACCTTA ACCACAAGCC ACCTTTTGGT 
AACACCATGA GTGGCATCCA CAAGAGCTTT GAGGTGCTCG GTTATACCAA CTCCAAAGGG 
AAAAAGGCCA TTCGAAGAGA GGACTTCCTG AGACTGCTCG TTACTAAAGG TGAGCATATG 
ACGGAGGAGG AGATGTTGGA TTGCTTTGCT TCACTGTTTG GCCTGAATCC CGAGGGATGG 
AAATCCGAGC CTGCAACCTG CTCCGTCAAA GGTTCAGAAA TTTGCCTTGA AGAAGAACTT 
CCAGACGAAA TCACTGCAGA AATATTCGCG ACTGAAATTC TTGGCTTAAC CATTTCAGAA 
GATTCCGGCC AGGATGGTCA GTGAAGTTAC CAGGAATGTT TAAAGCACAA AGGACTTTGG 
GTGTGTGTGC ATGCACATGT GTGTGTTTTC CATGAGGCAC TGCTTTTTAT GCATTTCCCT
CCCCCCTCTC ATCTTTAGAA CATTTAGACA TTAAAGCAAG TTTCTGGTGA GCAATG 



Seq ID NO: 267 Protein sequence: 
Protein Accession #: XP_084853.1

1 11 21 31 41 51 

I I I I I I 

MSGIHKSFEV LGYTNSKGKK AIRREDFLRL LVTKGEHMTE EEMLDCFASL FGLNPEGWKS 
EPATCSVKGS EICLEEELPD EITAEIFATE ILGLTISEDS GQDGQ 



Seq ID NO: 268 DNA sequence 
Nucleic Acid Accession #: NM_001898
Coding sequence: 57-482 



1 11 21 

I I I 

GGCTCTCACC CTCCTCTCCT GCAGCTCCAG 
CCCAGTATCT GAGTACCCTG CTGCTCCTGC 
GCCCCAAGGA GGAGGATAGG ATAATCCCGG 
AGTGGGTACA GCGTGCCCTT CACTTCGCCA 
ACTACTACAG ACGTCCGCTG CGGGTACTAA 
ATTACTTCTT CGACGTAGAG GTGGGCCGCA 
ACACCTGTGC CTTCCATGAA CAGCCAGAAC 
TCTACGAAGT TCCCTGGGAG AACAGAAGGT 
AGGGATCTGT GCCAGGCCAT TCGCACCAGC 
CCACCCCTGG ACTGGTGGCC CCCACCCTGC 
GACAGACAGA GAAGGCTGCA GGAGTCCTTT 
CTTCCTTCTT GCTTCTAATA GCCCTGGTAC 
AAACAGTAGC ATCGCC 



31 41 51 

I I I 

CTTTGTGCTC TGCCTCTGAG GAGACCATGG 
TGGCCACCCT AGCTGTGGCC CTGGCCTGGA 
GTGGCATCTA TAACGCAGAC CTCAATGATG 
TCAGCGAGTA TAACAAGGCC ACCAAAGATG 
GAGCCAGGCA ACAGACCGTT GGGGGGGTGA 
CCATATGTAC CAAGTCCCAG CCCAACTTGG 
TGCAGAAGAA ACAGTTGTGC TCTTTCGAGA 
CCCTGGTGAA ATCCAGGTGT CAAGAATCCT 
CACCACCCAC TCCCACCCCC TGTAGTGCTC 
GGGAGGCCTC CCCATGTGCC TGCGCCAAGA 
GTTGCTCAGC AGGGCGCTCT GCCCTCCCTC 
ATGGTACACA CCCCCCCACC TCCTGCAATT 



Seq ID NO: 269 Protein sequence: 
Protein Accession #: NP_001889.1

1 11 21 31 41 51 

I I I I I I

MAQYLSTLLL LLATLAVALA WSPKEEDRII PGGIYNADLN DEWVQRALHF AISEYNKATK 
DDYYRRPLRV LRARQQTVGG VNYFFDVEVG RTICTKSQPN LDTCAFHEQP ELQKKQLCSF 
EIYEVPWENR RSLVKSRCQE S 

Seq ID NO: 270 DNA sequence 
Nucleic Acid Accession #: XM_093210 
Coding sequence: 13-1854 

1 11 21 31 

I I I I 

ATGGCAAGCG CCGGAATCTC CTCAGCTGCC GTTTCACAAA 
AAACGAGCAC ACAAGCAGCA CCAGGAGCTG CAGAAGAAGG 
GGCAGAGGGA ATGGGGAGGG GGCATCCTAC CCCATATCTG 
GAGCGGACTG GGCCTTTCCC GTTGGCGCGT GGCCTCAATC 
GCCTTCAAAA CGGTAAGAGC TGCAACTGAA CGTGTGAGAC 
GGCGGCGGGA GAGATGCCCA TGAACTCAAG TACCCGGACA 
ACGAGTAACA CCGCCCCCAC GGGACCGCTC TCGAGGTCCC 
GGAACGCCCC GGCGCGCGGC CAGCAGCGGC GGGCACCGGC 
CACTGGCAGT CGGCCCTCCT CACACCGCAG GCGTGCAGTG 
GCCGAGGACC CAGCTAGGCC GTCACCCCGG TTGCTCCCAC 
CTGCCCAAGG CCCCGAGCCC AGGCTCCCTG GCGGAGGCCT 
ATGGCCGCCA CCAGGCTCCC GAGCCATGGC TTCCTGTCCG 
CTGTCCAGCT AG 



41 51 

I I 

AGAGGTACCA GGTCCGCACC 
AGGCGGCAGC GATGGACCAG 
AGGTGCGACT GCGGGACGTA 
AGOACTTCTT GCCCACGTGC 
ATGGTGCAGA TAGGCTGAGA 
CGCCCTCCAC TTCTACCACC 
CCAAGCCAAG GACGCAAGGA 
CCAATGGCCA CGGAACTCAG 
TGGCCGAOGG AGCCTCCCGG 
GGGAAGGGGC ACCAGGCAAA 
CCGCTGGTCC CGCCCAGATC 
GGAACGCCCC GGCGTCCTGG 



Seq ID NO: 271 Protein sequence: 
Protein Accession #: XP_093210

1 11 21 31 41 51 




I I 

MLRHGEQKRK RARKKWDFLP 
TTTSNTAPTG PLSRSPKPRT 
SRAEDPARPS PRLLPREGAP 
PNSSVGRKEB RPGAGQQRRA 
LPREGAPGKL PKAPSPGSLA 



I I 
TCAFKTVRAA TERVRHGADR 
QGGTPRRRPA AAGTRANGHG 
GKLPKAPSPG SLAEASAGLL
PAPMATELST GSRPSSHRRR 
EASAGPAQIM AATRLPSRGP 



I I 

LRGGGRDAHE LKYPDTPSTS 60 

TQHWQSALLT PQACSVADGA 120 

AHVRLQNADA QRVSISQALP 180 

AVWPTEPPGP RTQLEPSPRL 240 
LSGNGPASWL SS 






Seq ID NO: 272 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..732 



i 

GGATACTGTG 
TGAAAAAGCT 
TAATGTGGAG 
ATACCCACTT 
ATGATTTTGT 
TAAATTATTT 
TTAGTATCAC 
AAAATTGCAG 
TTAAGCC 



11 

I 

TCACTCAAAG 
TTTTTTCCCA 
GAAATTATTC 
GAAGCCTCTG 
CTTGTTTCTG 
TTATTTATCT 
AATTTATGGG 
AAGTCATAGG 



21 

I 

TAATGGGAGG 
CTTTTAACTT 
TTTCTCATTG 
TAGAAATGTC 
CAGTGAGAAA 
TTCATATAGT 
AGAGGGTTTT 
ACTGTCATGT 



31 
I 

GAGAGAGAAC 
GCTTTAGCGT 
GAGATTACAG 
TCGTCCTCCG 
TTACATCCAT 
TCTTACAATT 
TTGTATTTTT 
ATTGCAGCTC 



41 
I 

AGGGAGGGTA 
TAAGAGTACT 
AATATATCTA 
GTTGTATTTC 
AGCAAAGACA 
TCTAAAAAAT 
AAGCATATGT 
TGAGAACCAA 



51 
I 

GGGATGCTTT 
TACCAGCTAA 
TTCATCTTGA 
TAAAACCTAC 
AAAGTCTTTT 
TAACACTCAT 
GGCTTATATA 
TGCCTGAAAC 



60 
120 
180 
240 
300 
360 
420 
480 






Seq ID NO: 273 Protein sequence: 
Protein Accession #: Eos sequence



11 



21 

I 



MGGRENREGR DAFEKAFFPT PNLL 



31 
I 



41 
I 



51 



Seq ID NO: 274 DNA sequence 

Nucleic Acid Accession #: NM_003976.2

Coding sequence: 299-961 



1 

I 

CTCTGAGCTT 
CATGGAGTTG 
CTACTTCTGC 
GGGTGGCAGG 
CAGGAGGGTG 
GGAACTTGGA 
TGCCCTGTGG 
GGGCTCCGCG 
CGCCGGCCAC 
GCCGCCGCAG 
CGGGGGCCGC 
GGGCTGCCGC 
CGACGAGCTG 
CGACCTCAGC 
GCCCGTCAGC 
CAACAGCACC 
AGGGCTCGCT 
CCTCCCGCAG 
AGGCCCCTAC 
CAGCCCCAGA 
GGAGCCCTTC 
CCCTCCTCTG 
ACAGCATTTG 
CCTGTACTCA 



11 
I 

CTCTGAGCCT 
TGAAAGAATA 
TGGGTTGAGT 
CCGGTCCCCC 
GGGGAACAGC 
CTTGGAGGCC 
CCCACCCTGG 
CCCCGCAGCC 
CTGCCGGGGG 
CCTTCTCGGC 
GCGGCGCGGG 
CTGCGCTCGC 
GTGCGTTTCC 
CTGGCCAGCC 
CAGCCCTGCT 
TGGAGAACCG 
CCAGGGCTTT 
AGTCCCACTA 
CGGTGGGTGA 
GCCCTCACCC 
GGACCCACTT 
ATGAACACTA 
AAGGACACAT 
CTCATGGGAG 



21 

I 

TGTTTGCTCA 
GCTGCAAAGC 
CTAGCTGTGT 
ACAAAAGATA 
TCAACAATGG 
TCTCCACGCT
CCGCTCTGGC 
CTGCCCCCCG 
GACGCACGGC 
CCGCGCCCCC 
CTGGGGGCCC 
AGCTGGTGCC 
GCTTCTGCAG 
TACTGGGCGC 
GCCGACCCAC 
TGGACCGCCT 
GCAGACTGGA 
GCCAGCGGCC 
TGGATATCAT 
TGCGGATCCC 
CTCACAGACT 
CAGTGGCTGA 
ATTGCAGTTG 
CTGGCCCC 



31 
I 

TCTGGAAAAA 
ACCTAACACA 
AGGCCCCTTG 
ACTCATCTCT 
CTGATGGGCG 
GTCCCACTGC 
TCTGCTGAGC 
CGAAGGCCCC 
CCGCTGGTGC 
GCCGCCTGCA 
GGGCAGCCGC 
GGTGCGCGCG 
CGGCTCCTGC 
CGGGGCCCTG 
GCGCTACGAA 
CTCCGCCACC 
CCCTTACCGG 
TCAGCCAGGG 
CCCCGAACAG 
AGCCTAAAAG 
CTGGCACTGG 
GGCATCAGCC 
CTTGGTTGAA 



41 
I 

GGGGATTAAA 
TAGTAAGGTT 
TTCCTCACCT 
TAATTTGCAA 
CTCCTGGTGT 
CCCTGGCCTA 
AGCGTCGCAG 
CCGCCTGTCC 
AGTGGAAGAG 
CCCCCATCTG 
GCTCGGGCAG 
CTCGGCCTGG 
CGCCGCGCGC 
CGACCGCCCC 
GCGGTCTCCT 
GCCTGCGGCT 
TGGCTCTTCC 
ACGAAGGCCT 
GTGAAGGGAC 
ACACCAGAGA 
CCAGGCCTCG 
CCCGCCCAGG 
AGTGCCTGTG 



51 
I 

CCATTTACCT 
CCCAGTGCAG 
GGAGAAACTG 
GCTGCCTCAA 
TGATAGAGAT 
GGCGGCAGCC 
AGGCCTCCCT 
TGGCGTCCCC 
CCCGGCGGCC 
CTCTTCCCCG 
CGGGGGCGCG 
GCCACCGCTC 
GCTCTCCACA 
CGGGCTCCCG 
TCATGGACGT 
GCCTGGGCTG 
TGCCTGGGAC 
CAAAGCTGAG 
AACTGACTAG 
CCTCAGCTAT 
AACCTGGGAC 
CCCTGTAGGG 
CTGGAACTGG 



Seq ID NO: 275 Protein sequence: 
Protein Accession #: NP_003967.1

1 11 21 31 41 51 

I I I I I I

MELGLGGLST LSHCPWPRRQ PALWPTLAAL ALLSSVAEAS LGSAPRSPAP REGPPPVLAS 
PAGHLPGGRT ARWCSGRARR PPPQPSRPAP PPPAPPSALP RGGRAARAGG PGSRARAAGA 
RGCRLRSQLV PVRALGLGHR SDELVRFRFC SGSCRRARSP HDLSLASLLG AGALRPPPGS 
RPVSQPCCRP TRYEAVSFMD VNSTWRTVDR LSATACGCLG

Seq ID NO: 276 DNA sequence 

Nucleic Acid Accession #: NM_057091.1

Coding sequence: 783-1445 



ACTGGCCGCT 
GGACCCCCAA 
TCGCTCCCCG 
CGCGTGTCTA 
CTCCATATCC 
CAAGCTAGGG 
CGGGGCAGGG 
CACCGGACGG 
CAGACAAGGC 



11 
I 

GAGAGAAGAA 
ATCTGCACGT 
CCCTCACTCA 
CAAACTCAAC 
GAGGGGCCCC 
GGGACTGGAT 
GCGCTCCCAG 
CTGCGGCGGC 
CCGGGGGCTC 



21 
I 

TCGGGTGGAG 
ACCAGCAGTC 
CTTTCTCCCG 
TCCCGGTTTC 
TCCCAGCATC 
CCGACGGGTG 
CCCCACCCCG 
GGGCAGGAGG 
CGCCAGCAGC 



31 
I 

CAGAGAGCAG 
AGCCGCCCCA 
CCCTCGGCCC 
CGTGCCTCTC 
TACCCCCCTC 
GAGCAGCCAG 
GGATCTGGTG 
CTGCTGAGGG 
AGGTCCCTCG 



41 
I 

CTGCTGCAGG 
CGCAGGGACC 
GGCCTCCCAG 
CACCGCTCGA 
CCAACCTCGG 
GTGAGCCCCG 
ACGCTGGGGC 
ATGGAGTTGG 
GGCCCCAGCC 



51 
I 

GCAGACAGCC 
GGCTTACCCC 
CTCTCTACTT 
GTTCTCTACT 
GGGACCTAGC 
AAAGGTGGGG 
TGGAATTTGA 
GCCCGGCCCC 
CTCGCTGCCA 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 



60 
120 
180 



60 
120 
180 
240 
300 
360 
420 
480 
540 






CCCGGGCCTG 
TAAAAGAGGC 
GCCCAGCACT 
TCAACAGGAG 
AGATGGAACT 
AGCCTGCCCT 
CCCTGGGCTC 
CCCCCGCCGG
GGCCGCCGCC 
CCCGCGGGGG 
CGGGGGGCTG 
GCTCCGACGA 
CACACGACCT 
CCCGGCCCGT 
ACGTCAACAG 
GCTGAGGGCT 
GGACCCTCCC 
TGAGAGGCCC 
CTAGCAGCCC 
CTATGGAGCC 
GGACCCCTCC 
AGGGACAGCA 
CTGGCCTGTA 



GAGCCCCACA 
ACTGCCAGGT 
GGTCCCCGGA 
GGTGGGGGAA 
TGGACTTGGA 
GTGGCCCACC 
CGCGCCCCGC 
CCACCTGCCG 
GCAGCCTTCT 
CCGCGCGGCG 
CCGCCTGCGC 
GCTGGTGCGT 
CAGCCTGGCC 
CAGCCAGCCC 
CACCTGGAGA 
CGCTCCAGGG 
GCAGAGTCCC 
CTACCGGTGG
CAGAGCCCTC 
CTTCGGACCC 
TCTGATGAAC 
TTTGAAGGAC 
CTCACTCATG 



CCCGAGGGTG 
GTACAGTCCT 
AAGGTGCCTA 
CAGCTCAACA 
GGCCTCTCCA 
CTGGCCGCTC 
AGCCCTGCCC 
GGGGGACGCA 
CGGCCCGCGC 
CGGGCTGGGG 
TCGCAGCTGG 
TTCCGCTTCT 
AGCCTACTGG 
TGCTGCCGAC
ACCGTGGACC 
CTTTGCAGAC 
ACTAGCCAGC 
GTGATGGATA 
ACCCTGCGGA
ACTTCTCACA 
ACTACAGTGG 
ACATATTGCA 
GGAGCTGGCC 



CAGACTGGCT 
GGGCATGCGC 
GAAGAACAAG 
ATGGCTGATG 
CGCTGTCCCA 
TGGCTCTGCT 
CCCGCGAAGG 
CGGCCCGCTG 
CCCCGCCGCC
GCCCGGGCAG 
TGCCGGTGCG
GCAGCGGCTC 
GCGCCGGGGC 
CCACGCGCTA 
GCCTCTCCGC 
TGGACCCTTA 
GGCCTCAGCC 
TCATCCCCGA 
TCCCAGCCTA 
GACTCTGGCA 
CTGAGGCATC 
GTTGCTTGGT 
CC 



GCCAAGGCCA 
TGTTTGAGCT 
GTGCAGGACC 
GGCGCTCCTG 
CTGCCCCTGG 
GAGCAGCGTC 
CCCCCCGCCT 
GTGCAGTGGA 
TGCACCCCCA 
CCGCGCTCGG 
CGCGCTCGGC 
CTGCCGCCGC 
CCTGCGACCG 
CGAAGCGGTC 
CACCGCCTGC 
CCGGTGGCTC 
AGGGACGAAG 
ACAGGTGAAG 
AAAGACACCA 
CTGGCCAGGC 
AGCCCCCGCC 
TGAAAGTGCC 



CACTTTTGGC 
TCGGGGGAGA 
CCGTGCTGCC 
GTGTTGATAG 
CCTAGGCGGC 
GCAGAGGCCT 
GTCCTGGCGT 
AGAGCCCGGC 
TCTGCTCTTC 
GCAGCGGGGG 
CTGGGCCACC 
GCGCGCTCTC 
CCCCCGGGCT 
TCCTTCATGG 
GGCTGCCTGG 
TTCCTGCCTG 
GCCTCAAAGC 
GGACAACTGA 
GAGACCTCAG 
CTCGAACCTG 
CAGGCCCTGT 
TGTGCTGGAA 



Seq ID NO: 277 Protein sequence:
Protein Accession #: NP_003967.1

1 11 21 31 41 51

I I I I I I 

MELGLGGLST LSHCPWPRRQ PALWPTLAAL ALLSSVAEAS LGSAPRSPAP REGPPPVLAS
PAGHLPGGRT ARWCSGRARR PPPQPSRPAP PPPAPPSALP RGGRAARAGG PGSRARAAGA 
RGCRLRSQLV PVRALGLGHR SDELVRFRFC SGSCRRARSP HDLSLASLLG AGALRPPPGS 
RPVSQPCCRP TRYEAVSFMD VNSTWRTVDR LSATACGCLG 

Seq ID NO: 278 DNA sequence 

Nucleic Acid Accession #: NM_057160.1 

Coding sequence: 1-714 



600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 



60 
120 
180 



I 

ATGCCCGGCC 
CACCTGGGTG 
TGGCCCACCC 
GCGCCCCGCA 
CACCTGCCGG 
CAGCCTTCTC 
CGCGCGGCGC 
CGCCTGCGCT 
CTGGTGCGTT 
AGCCTGGCCA 
AGCCAGCCCT 
ACCTGGAGAA 
GCTCCAGGGC 
CAGAGTCCCA 
TACCGGTGGG 
AGAGCCCTCA 
TTCGGACCCA 
CTGATGAACA 
TTGAAGGACA 
TCACTCATGG 



11 
I 

TGATCTCAGC 
CCCTCTTTCT 
TGGCCGCTCT 
GCCCTGCCCC 
GGGGACGCAC 
GGCCCGCGCC 
GGGCTGGGGG 
CGCAGCTGGT 
TCCGCTTCTG 
GCCTACTGGG 
GCTGCCGACC 
CCGTGGACCG 
TTTGCAGACT 
CTAGCCAGCG 
TGATGGATAT 
CCCTGCGGAT 
CTTCTCACAG 
CTACAGTGGC 
CATATTGCAG 
GAGCTGGCCC 



21 
I 

CCGAGGACAG 
CCCTGAGGCT 
GGCTCTGCTG 
CCGCGAAGGC 
GGCCCGCTGG 
CCCGCCGCCT 
CCCGGGCAGC 
GCCGGTGCGC 
CAGCGGCTCC 
CGCCGGGGCC 
CACGCGCTAC 
CCTCTCCGCC 
GGACCCTTAC 
GCCTCAGCCA 
CATCCCCGAA 
CCCAGCCTAA 
ACTCTGGCAC 
TGAGGCATCA 
TTGCTTGGTT 
C 



31 
I

CCACTTGGTC 
AGCAGCGTCG 
CCCCCGCCTG 
TGCAGTGGAA 
GCACCCCCAT 
CGCGCTCGGG 
GCGCTCGGCC 
TGCCGCCGCG 
CTGCGACCGC 
GAAGCGGTCT 
ACCGCCTGCG 
CGGTGGCTCT 
GGGACGAAGG 
CAGGTGAAGG 
AAGACACCAG 
TGGCCAGGCC 
GCCCCCGCCC 
GAAAGTGCCT 



41 
I 

AGGTCCTTCC 
TCTCCGCGCA 
CAGAGGCCTC 
TCCTGGCGTC 
GAGCCCGGCG 
CTGCTCTTCC 
CAGCGGGGGC 
TGGGCCACCG 
CGCGCTCTCC 
CCCCGGGCTC 
CCTTCATGGA 
GCTGCCTGGG 
TCCTGCCTGG 
CCTCAAAGCT 
GACAACTGAC 
AGACCTCAGC 
TCGAACCTGG 
AGGCCCTGTA 
GTGCTGGAAC 



51 
I 

TCCCCAAGCC 
GCCTGCCCTG 
CCTGGGCTCC 
CCCCGCCGGC 
GCCGCCGCCG 
CCGCGGGGGC
GCGGGGCTGC 
CTCCGACGAG 
ACACGACCTC 
CCGGCCCGTC
CGTCAACAGC 
CTGAGGGCTC 
GACCCTCCCG 
GAGAGGCCCC 
TAGCAGCCCC 
TATGGAGCCC 
GACCCCTCCT 
GGGACAGCAT 
TGGCCTGTAC 



Seq ID NO: 279 Protein sequence: 
Protein Accession #: NP_476501.1



11 



21 



31 



41 



MPGLISARGQ PLLEVLPPQA HLGALFLPEA PLGLSAQPAL 
APRSPAPREG PPPVLASPAG HLPGGRTARW CSGRARRPPP 
RAARAGGPGS RARAAGARGC RLRSQLVPVR ALGLGHRSDE 
SLASLLGAGA LRPPPGSRPV SQPCCRPTRY EAVSFMDVNS 

Seq ID NO: 280 DNA sequence 

Nucleic Acid Accession #: NM_057090.1

Coding sequence: 29-715 



51 
I 



WPTLAALALL SSVAEASLGS 
QPSRPAPPPP APPSALPRGG 
LVRFRFCSGS CRRARSPHDL 
TWRTVDRLSA TACGCLG 



CTGATGGGCG 
GTCCCACTGC 
GTGGCCCACC 
CGCGCCCCGC 
CCACCTGCCG 
GCAGCCTTCT 
CCGCGCGGCG 
CCGCCTGCGC 
GCTGGTGCGT 
CAGCCTGGCC 
CAGCCAGCCC 



11 
I 

CTCCTGGTGT 
CCCTGGCCTA 
CTGGCCGCTC 
AGCCCTGCCC 
GGGGGACGCA 
CGGCCCGCGC 
CGGGCTGGGG 
TCGCAGCTGG 
TTCCGCTTCT 
AGCCTACTGG 
TGCTGCCGAC



21 
I 

TGATAGAGAT 
GGCGGCAGGC 
TGGCTCTGCT 
CCCGCGAAGG 
CGGCCCGCTG 
CCCCGCCGCC 
GCCCGGGCAG 
TGCCGGTGCG 
GCAGCGGCTC 
GCGCCGGGGC 
CCACGCGCTA 



31 
I 

GGAACTTGGA 
TCCACTTGGT 
GAGCAGCGTC 
CCCCCCGCCT 
GTGCAGTGGA 
TGCACCCCCA 
CCGCGCTCGG 
CGCGCTCGGC 
CTGCCGCCGC 
CCTGCGACCG 
CGAAGCGGTC 



41 
I 

CTTGGAGGCC 
CTCTCCGCGC 
GCAGAGGCCT 
GTCCTGGCGT 
AGAGCCCGGC 
TCTGCTCTTC 
GCAGCGGGGG 
CTGGGCCACC 
GCGCGCTCTC 
CCCCCGGGCT 
TCCTTCATGG 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 



60 
120 
180 



51 
I 

TCTCCACGCT 
AGCCTGCCCT 
CCCTGGGCTC 
CCCCCGCCGG 
GGCCGCCGCC 
CCCGCGGGGG 
CGCGGGGCTG 
GCTCCGACGA 
CACACGACCT 
CCCGGCCCGT 
ACGTCAACAG 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 



291 



WO 02/086443 

CACCTGGAGA ACCGTGGACC GCCTCTCCGC CACCGCCTGC GGCTGCCTGG GCTGAGGGCT 720 

CGCTCCAGGG CTTTGCAGAC TGGACCCTTA CCGGTGGCTC TTCCTGCCTG GGACCCTCCC 780 

GCAGAGTCCC ACTAGCCAGC GGCCTCAGCC AGGGACGAAQ GCCTCAAAGC TGAGAGGCCC 840 

CTACCGGTGG GTGATGGATA TCATCCCOGA ACAGGTGAAG GGACAACTGA CTAGCAGCCC 900 

CAGAGCCCTC ACCCTGCGGA TCCCAGCCTA AAAGACACCA GAGACCTCAG CTATGGAGCC 960 

CTTCGGACCC ACTTCTCACA GACTCTGGCA CTGGCCAGGC CTOGAACCTG GGACCCCTCC 1020 

TCTGATGAAC ACTACAGTGG CTGAGGCATC AGCCCCCGCC CAGGCCCTGT AGGGACAGCA 1080 

TTTGAAGGAC ACATATTGCA GTTGCTTGGT TGAAAGTGCC TGTGCTGGAA CTGGCCTGTA 1140 
CTCACTCATG GGAGCTGGCC CC 

Seq ID NO: 281 Protein sequence: 
Protein Accession #: NP_476431.1 

1 11 21 31 41 51 

I I I I I I 

MELGLGGLST LSHCPWPRRQ APLGLSAQPA LWPTLAALAL LSSVAEASLG SAPRSPAPRE 60 

GPPPVLASPA GHLPGGRTAR WCSGRARRPP PQPSRPAPPP PAPPSALPRG GRAARAGGPG 120 

SRARAAGARG CRLRSQLVPV RALGLGHRSD ELVRFRFCSG SCRRARSPHD LSLASLLGAG 180 
ALRPPPGSRP VSQPCCRPTR YEAVSFMDVN STWRTVDRLS ATACGCLG 

Seq ID NO: 282 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

1 11 21 31 41 51 

I I I I I I 

CTACTGCACC TGCCCTCTGT TTCCTTTGGA AATCTCTTAC CTTTCATTAG GGTTTCTTTC 60 

ATAGCAATTT CCTTTGGTTT TTAAGACTTC TACATTGCTT TTTCTTTTAT TATCTGTGCT 120 

COGTGAA0CT TATGAATGCT GCTTAAAAAT AATGTCAAAA TATGTTTTAG CTGCCTACTC 180 

AGGTAACGTT TTCTTTTGCT CTCATCTTGG TTTCCATATA CTATTTTTGG TTTTTTGTGA 240 

GATCTAATCA ATGATCTAGT CAGAAGCTAC TTCACTGGCT AACAGTGATC ATGTTCATGT 300 

GCTAAAAATG AACTTGAAAC ACGGAAGTAG TGGTTGGTCC AGTTTGAAAG CTCTTATTAG 360 

TATTCTTCAT CCTGGCTGTA ATAATAGCCA TTATTTGTTA TGCCTTTGTT ATGTAGCAGA 420 

CACTCTTAAG GATTTTATGT GTATTATTCA AATTGCTATT ACTGTTCTTT TTATAGTTGA 480 
GAATCTCAGG ATACCTACAT TTATCACTTT TTCAATATAT ATGTATTTCT TATT 

Seq ID NO: 283 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 564-1481 

1 11 21 31 41 51 

I I I I I I 

GAGACTTTTA ATCATCTATC CCTTGTGCTT TACGCAGACC CTACAATACA CTAGAGGCTT 60 

CAAAGAGGTC AAAAATTCAC ATGTGTAGAC AAATTAGGTC CCTTAAGATG CCAGGCAAAC 120 

GAAGTGCTAC CAAAACACGC AATGACTGTC CTAAAAGTGC GTTCTGGGAT ACACCTGTAA 180 

ACTTGGATCA AGTTCCCTCC CCTCTCCTCA AAATATATCG ACTTGTGCTG AAAGAAATCA 240 

CGACCGATGC TCACAATTCT GACCTCGTAA TTATATAGGG GGTGGTTTTG GTTTCTGCGT 300 

CTOTCCCTGA TTCAGTGGCA GGTAACATAT TTCATGTACA AAATGAACTG CAACACCACG 360 

GCAAACAAGG GACAGGCCCT CAAAGTTGTC GGTAGGGAGC CAGGACCCCG CCAGTGGCGT 420 

GGGGAGACAC CGTACTAAAC AAGCTTGCAA ACAGCAGGCA CCTTCCTGCC ACTGAGGAGG 480 

AAGGGCTGGC TAAGGGAGGC CGGGGCGGAG GAAGCCAAGC TCTGCAGGCC CTGACAAAGT 540 

CCTCCCGGCC TCCACGCGTC GCCATGGCAA CGCGGGGTCT GTGCTGGCCG GGATTGGCCG 600 

GCCTGGCGCG CGCAGGGCCC GCTGGGAAAG CGCGTCCCCG CCGCGGCTCC GCCAGTTTGA 660 

ACTTGGCGGG CCAGATGTGG GCGGCGGGGC GCTGGGGGCC TACTTTTCCC TCTTCCTACG 720 

CCGGTTTCTC TGCTGACTGC AGACCCAGGT CTCGGCCCTC CTCGGACTCC TGCTCAGTCC 780 

CTATGACGGG CGCACGTGGG CAGGGGCTGG AGGTGGTGCG CTCGCCGTCG CCGCCGCTGC 840 

CGCTGAGCTG CAGCAATTCC ACCAGGTCGC TGTTGTCTCC CCTTGGCCAC CAGAGCTTCC 900 

AGTTTGACGA GGACGACGGT GACGGGGAGG ATGAGGAAGA CGTGGATGAT GAGGAAGACG 960 

TGGATGAAGA TGCCCATGAT TCAGAGGCCA AAGTGGCGAG CCTGAGAGGA ATGGAGTTAC 1020 

AGGGGTGCGC CAGCACTCAG GTTGAATCAG AAAATAACCA AGAAGAACAG AAACAGGTGC 1080 

GCTTACCAGA AAGCCGCCTG ACACCATGGG AGGTGTGGTT TATTGGCAAA GAAAAAGAAG 1140 

AACGTGACCG GCTGCAACTG AAAGCTCTAG AGGAATTAAA TCAACAACTA GAAAAAAGAA 1200 

AAGAAATGGA AGAACGTGAA AAAAGAAAGA TAATTGCTGA AGAAAAGCAC AAGGAATGGG 1260 

TTCAGAAAAA GAATGAGCAA AAAAGAAAAG AAAGAGAACA AAAAATTAAT AAAGAAATGG 1320 

AGGAAAAAGC AGCAAAGGAA CTGGAGAAAG AATACTTGCA AGAAAAAGCA AAAGAAAAAT 1380 

ATCAAGAATG GTTAAAGAAA AAAAATGCTG AAGAATGTGA GAGGAAGAAG AAAGAAAAGA 1440 

AAAACAACAG CAAGCTGAAA TACAGGAGAA AAAGGAAATA GCAGAAAAAA AGTTTCAAGA 1500 

ATGGTTGGAA AATGCGAAAC ATAAACCTCG TCCAGCTGCA AAGAGCTATG GTTATGCCAA 1560 

TGGAAAACTT ACAGGTTTTT ACAGTGGAAA TTCCTATCCA GAACCAGCCT TTTATAATCC 1620 

AATTCCGTGG AAACCAATTC ATATGCCACC TCCCAAAGAA GCTAAGGATC TATCAGGAAG 1680 

GAAGAGTAAA AGACCTGTGA TAAGTCAGCC ACACAAGTCA TCATCTCTGG TAATTCATAA 1740 

AGCCAGGAGC AATCTTTGCC TTGGAACTCT GTGCAGAATA CAAAGATAGC GTATGTGGAA 1800 

AATAACATGC TTTTATCTGG AGCTATTTAA TTTAAAAATC AGAAATTGTT TTTTACTGCT 1860 

CAGTCAATAA CTCAACACTT AATGTGATTA TTGACAAATA GCAATTTTTG CATTTGTATA 1920 

TGGAGTCCTT AGAGTTGAGG AAGATATTTT CTGGATTTTG GTTTTTATAA ACTTTTTAAG 1980 

GTTGATCTTG GCATGTTGTT TTGCAGAATA AGTGGCTGAA TATGTAAGAA TTGTGTTTGT 2040 
ATTTAGCTTG TATTAAAAGT ACACTGTAAT ACCAATAAAA CTAACAATTT TTCTTG 

Seq ID NO: 284 Protein sequence: 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

I I I I I I 

MATRGLCWPG LAGLARAGPA GKARPRRGSA SLNLAGQMWA AGRWGPTFPS SYAGFSADCR 60 

PRSRPSSDSC SVPMTGARGQ GLEVVRSPSP PLPLSCSNST RSLLSPLGHQ SFQFDEDDGD 120 

GEDEEDVDDE EDVDEDAHDS EAKVASLRGM ELQGCASTQV ESENNQEEQK QVRLPESRLT 180 

PWEVWFIGKE KEERDRLQLK ALEELNQQLE KRKEMEEREK RKIIAEEKHK EWVQKKNEQK 240 

RKEREQKINK EMEEKAAKEL EKEYLQEKAK EKYQEWLKKK NAEECERKKK EKKNNSKLKY 300 
RRKRK 

Seq ID NO: 285 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 1-1746 

1 11 21 31 41 51 

I I I I I I 

ATGCCACTGA AGCATTATCT CCTTTTGCTG GTGGGCTGCC AAGCCTGGGG TGCAGGGTTG 60 

GCCTACCATG GCTGCCCTAG CGAGTGTACC TGCTCCAGGG CCTCCCAGGT GGAGTGCACC 120 

GGGGCACGCA TTGTGGCGGT GCCCACCCCT CTGCCCTGGA ACGCCATGAG CCTGCAGATC 180 

CTCAACACGC ACATCACTGA ACTCAATGAG TCCCCGTTCC TCAATATCTC AGCCCTCATC 240 

GCCCTGAGGA TTGAGAAGAA TGAGCTGTCG CGCATCACGC CTGGGGCCTT CCGAAACCTG 300 

GGCTCGCTGC GCTATCTCAG CCTCGCCAAC AACAAGCTGC AGGTTCTGCC CATCGGCCTC 360 

TTCCAGGGCC TGGACAGCCT TGAGTCTCTC CTTCTGTCCA GTAACCAGCT GTTGCAGATC 420 

CAGCCGGCCC ACTTCTCCCA GTGCAGCAAC CTCAAGGAGC TGCAGTTGCA CGGCAACCAC 480 

CTGGAATACA TCCCTGACGG AGCCTTCGAC CACCTGGTAG GACTCACGAA GCTCAATCTG 540 

GGCAAGAATA GCCTCACCCA CATCTCACCC AGGGTCTTCC AGCACCTGGG CAATCTCCAG 600 

GTCCTCCGGC TGTATGAGAA CAGGCTCACG GATATCCCCA TGGGCACTTT TGATGGGCTT 660 

GTTAACCTGC AGGAACTGGC TCTACAGCAG AACCAGATTG GACTGCTCTC CCCTGGTCTC 720 

TTCCACAACA ACCACAACCT CCAGAGACTC TACCTGTCCA ACAACCACAT CTCCCAGCTG 780 

CCACCCAGCA TCTTCATGCA GCTGCCCCAG CTCAACCGTC TTACTCTCTT TGGGAATTCC 840 

CTGAAGGAGC TCTCTCTGGG GATCTTCGGG CCCATGCCCA ACCTGCGGGA GCTTTGGCTC 900 

TATGACAACC ACATCTCTTC TCTACCCGAC AATGTCTTCA GCAACCTCCG CCAGTTGCAG 960 

GTCCTGATTC TTAGCCGCAA TCAGATCAGC TTCATCTCCC CGGGTGCCTT CAACGGGCTA 1020 

ACGGAGCTTC GGGAGCTGTC CCTCCACACC AACGCACTGC AGGACCTGGA CGGGAATGTC 1080 

TTCCGCATGT TGGCCAACCT GCAGAACATC TCCCTGCAGA ACAATCGCCT CAGACAGCTC 1140 

CCAGGGAATA TCTTCGCCAA CGTCAATGGC CTCATGGCCA TCCAGCTGCA GAACAACCAG 1200 

CTGGAGAACT TGCCCCTCGG CATCTTCGAT CACCTGGGGA AACTGTGTGA GCTGCGGCTG 1260 

TATGACAATC CCTGGAGGTG TGACTCAGAC ATCCTTCCGC TCCGCAACTG GCTCCTGCTC 1320 

AACCAGCCTA GGTTAGGGAC GGACACTGTA CCTGTGTGTT TCAGCCCAGC CAATGTCCGA 1380 

GGCCAGTCCC TCATTATCAT CAATGTCAAC GTTGCTGTTC CAAGCGTCCA TGTCCCTGAG 1440 

GTGCCTAGTT ACCCAGAAAC ACCATGGTAC CCAGACACAC CCAGTTACCC TGACACCACA 1500 

TCCGTCTCTT CTACCACTGA GCTAACCAGC CCTGTGGAAG ACTACACTGA TCTGACTACC 1560 

ATTCAGGTCA CTGATGACCG CAGCGTTTGG GGCATGACCC AGGCCCAGAG CGGGCTGGCC 1620 

ATTGCCGCCA TTGTAATTGG CATTGTCGCC CTGGCCTGCT CCCTGGCTGC CTGCGTCGGC 1680 

TGTTGCTGCT GCAAGAAGAG GAGCCAAGCT GTCCTGATGC AGATGAAGGC ACCCAATGAG 1740 

TGTTAAAGAG GCAGGCTGGA GCAGGGCTGG GGAATGATGG GACTGGAGGA CCTGGGAATT 1800 

TCATCTTTCT GCCTCCACCC CTGGGTCCAT GGAGCTTTCC CGTGATTGCT CTTTCTGGCC 1860 

CTAGATAAAG GTGTGCCTAC CTCTTCCTGA CTTGCCTGAT TCTCCCGTAG AGAAGCAGGT 1920 

CGTGCCGGAC CTTCCTACAA TCAGGAAGAT AGATCCAACT GGCCATGGCA AAAGCCCTGG 1980 

GGATTTCCGA TTCATACCCC TGGGCTTCCT TCGAGAGGGC TCTTCCTCCA AATCCTCCCC 2040 

ACCTGTCCTC CAAGAACAGC CTTCCCTGCG CCCAGGCCCC CTCCGGGCCT CTGTAGACTC 2100 

AGTTAGTCCA CAGCCTGCTC ACTTCGTGGG AATAGTTCTC CGCTGAGATA GCCCCTCTCG 2160 

CCTAAGTATT ATGTAAGTTG ATTTCCCTTC TTTTGTTTCT CTTGTTTGTG CTATGGCTTG 2220 

ACCCAGCATG TCCCCTCAAA TGAAAGTTCT CCCCTTGATT TTCTGCTCCT GAAGGCAGGG 2280 

TGAGTTCTCT CCTCAAAGAA GACTTCAAAC CATTTAACTG GTTTCTTAAG AGCCGTCAAT 2340 

CAGCCTGGTT TTGGGGATGC TATGAAAGAG AGAAGGAAAA TCATGCCGCT CAGTTCCTGG 2400 

AGACAGAAGA GCCGTCATCA GTGTCTCACT TGTGATTTTT ATCTGGAAAA GGAAGAAACA 2460 

CCCCAGCACA GCAAGCTCAG CCTTTTAGAG AAGGATATTT CCAAACTGCA AACTTTGCTT 2520 

TGAAAAGTTT AGCCCTTTAA GGAATGAAAT CATGTAGAAT TTTGGACTTC TAAAAACATT 2580 

AAAATCAGCT TATTAATACG GGATAGAGAA AGAAATCTGG TGCCTGGGGG TCCCTGTGTT 2640 

CACCCCTAGA GTTTGTTTTA AAATTTTTAA TTGAAGCATG TGAAGTGTAC STGCAGAAAA 2700 

GTGGGAACAT GATAGTGTAT GGCTTGGTGG ATTTTCACAA ACTGAACATA CCTGTGTAAT 2760 

CAGCATCTAG ACCCAGACCC AGAGCATCAC AAATATCCCC CATCCTGGGC TTTTCCCAGA 2820 

GGAGATGGGG GCTTCTGAAG ATGGACTTAC CTGGGACCTG CCCCCCATGA GCCAGGACGG 2880 

TCCCCCCACA GTCAGCCTGT GCAAAGGCCC CGTGGCCAGG GGTGGAGGAG AATATGTGGG 2940 

TGTGGACAGG ATGGGAGACT GTGGCCTGAA CAGGAGATTT TATTATATCT GGAGACCCTG 3000 

AGAGACCCTG AGACCTGGGG CACCATGGCT GGCCAGGTCA GAAGCATCCT GACTGCAGAG 3060 

GTCCGTGCAG CCACACCCTC TTCCCTGCCA GCAAGTTGTC TGCGGCTCAT CGGAGGCCCC 3120 

TCCGCCTGGA GCCTTCTATG GACGTGATAT GCCTGTATCT GTTTTTAATT TTCATTCTTC 3180 

ACTTAGGGGA AGTGAAATCG CTCAGAGATG AGATCCTTTA ATTGAAAACG AAGTGTAACG 3240 

GAATCTAGTG TCTTTCTAAT GTGGTAAAAT TCTCCATCAA CATCACAGTC AGCTGGCAGC 3300 

TGAACTTCAG AATCTCACTT ACAGCAGGCG ACACGGGGGT ACACCGATGG GTCACACTGG 3360 

GTCTGGGGGC TCCCTGGAGC TCCTCCTGCG TGTGGTCTGG TTAGGAGTTG AGTTGTTTGC 3420 

TCCAGGGTTA TTCTCCTCCT CGAGTCACAG TCACACGAAT ACCTGCCTTC TCTGGCTTTC 3480 

CTGCTATACA CATATTCACA TGGCGCTCAA GAAGTTAGGC TCATGGCAAC GTGTGTCTTT 3540 

CTCTGGACAA CTGGCCCAGT TTACAGTGAA ATGGAGAATT TCAGGTCTCC ACGTCTGCCC 3600 

AGGAAAGAAC TTCAGCTGAC TCCACGGGGA TCTGGAAATC CACGACCAAT CCCGATCGGC 3660 

TCTTATTAGC TCCCCGCTCC ACAAGACACC TGTGCTTTGG AAATCCACCA CCAATCCCGA 3720 

TCGGCTCTTA TTAGCTCCCC GCTCCACAAG ACACCTGTGA TCTGGAAATC TACCACCAAT 3780 

CCCGATCGGC TCTTATTAGC TCCCCGCTCC ACAAGACACC TGTGACATCC TCCAGGGCCA 3840 

CAGGAGCACG TGCTGACCAG TTTTCCCTTC CAGTTCCTGC ACAAAAAGTG TCCAGAGGGC 3900 

TGTTTGCAAA CACTAGTGCA CTTTGTAGCT TTTCACCCTC TGTCCCAGGG AATCTAGGAG 3960 

AGATGAGGCC CGTCAGAGTC AAGAGATGTC ATCCCCCCAG GGTCTCCAAG GCATTTCCAC 4020 

ACTATTGGTG GCACCTGGAG GACATGCACC AAGGCTTGCC AGAGCCAACA GGAAGTGAGC 4080 

CCAGAGCATG GCACATGAGC ATCACCCGCT GATGGTGGCC TGCTGTGCCT GGTGCCAACA 4140 

GGGGCATCCC GGCCCGTACC CCTCCAGACA GGAAGCATGG GTTTGCCCAC AGACCTGTCG 4200 

GGTGCTCCTG TGAGTGGCCT CCAGATGTCT TTGTGCATAG GCACAAGTGG GCCAGGGCTG 4260 

GAGGGAGGTG GGAAACCTCA TCATCCGGTG GGCCCTGCCA ATCTTAACCC AGAACCCTTA 4320 

GGTATTCCTG GCAGTAGCCA TGACATTGGA GCACCTTCCT CTCCAGCCAG AGGCTGACCT 4380 

GAGGGCCACT GTCCTCAGAT GACACCACCC AGGAGCACCC TAGGTGAGGG GTGAGGGCCC 4440 

CCTTATGTGA ACCTCTTGCC TCTTCCTTTC TCCCATCAGA GTGGTTGGAT GGAGCCATTG 4500 

GCCTCCTTTT CTTCAGCGGG CCCTTCAACC TCTCTGCACC ATGTTGTCTG GCTGAGGAGC 4560 

TACTAGAAAA GCTGAGTGGA GTCTCCTTTC CAACAGGATG ATGCATTTGC TCAATTCTCA 4620 

GGGCTGGAAT GAGCCGGCTG GTCCCCCAGA AAGCTGGAGT GGGGTACAGA GTTCAGTTTT 4680 

CCTCTCTGTT TACAGCTCCT TGACAGTCCC ACGCCCATCT GGAGTGGGAG CTGGGAGTTA 4740 

GTGTTGGAGA AGAAACAACA AAAGCCAATT AGAACCACTA TTTTTAAAAA GTGCTTACTG 4800 

TGCACAGATA CTCTTCAAGC ACTGGACGTG GATTCTCTCT CTAGCCCTCA GCACCCCTGC 4860 

GGTAGGAGTO CCGCCTCTAC CCACTTGTGA TGGGGTACAG AGGCACTTGC TCTTCTGCAT 4920 

GGTGTTCAAT AGGCTGGGAQ TTTTATTTAT CTCTTCAAAC TTTGTACAAG AGCTCATGGC 4980 

TTGTCTTGGG CTTTCGTCAT TAAACCAAAG GAAATGGAAG CCATTCCCCT GTTGCTCTCC 5040 

TTAGTCTTGG TCATCAGAAC CTCACTTGGT ACCATATAGA TCAAAAGCTT TGTAACCACA 5100 

GGAAAAAATA AACTCTTCCA TCCCTTAAAG AATAGAATAG TTTGTCCCTC TCATGGGAAT 5160 

TGGGCTGTAT GTATATTGTT CTTCCTCCTT AGAATTTAGA GATACAAGAG TTCTACTTAG 5220 

AACTTTTCAT GGACACAATT TCCACAACCT TTCAGATGCT GATGTAGAGC TATTGGGAAA 5280 

GAACTTCCAA ACTCAGGAAG TTTGCAGAGA GCAGACAGCT AGAGATAACT CGGGACCCAG 5340 

AGTTGGTCGA CAGATGTTAG ATGTATCCTA GCTTTTAGCC ATAAACCACT CAAAGATTCA 5400 

GCCCCCAGAT CCCACAGTCA GAACTGAATC TGCGTTGTTG GGAAGCCAGC AGTGGCCTTG 5460 

GGAAGGAAGC CATGGCTGTG GTTCAGAGAG GGTGGGCTGG CAAGCCACTT CCGGGGAAAA 5520 

CTCCTTCCGC CCCAGGTTTC TTCTTCTCTT AAGGAGAGAT TGTTCTCACC AACCCGCTGC 5580 

CTTCATGCTG CCTTCAAAGC TAGATCATGT TTGCCTTGCT TAGAGAATTA CTGCAAATCA 5640 

GCCCCAGTGC TTGGCGATGC ATTTACAGAT TTCTAGGCCC TCAGGGTTTT GTAGAGTGTG 5700 

AGCCCTGGTG GGCAGGGTTG GGGGGTCTGT CTTCTGCTGG ATGCTGCTTG TAATCCATTT 5760 
GGTGTACAGA ATCAACAATA AATAATATAC ATGTAT 

Seq ID NO: 286 Protein sequence: 
Protein Accession #: NP_570843.1 



1 11 21 31 41 51 

I I I I I I 

MPLKHYLLLL VGCQAWGAGL AYHGCPSECT CSRASQVECT GARIVAVPTP LPWNAMSLQI 60 
LNTHITELNE SPFLNISALI ALRIEKNELS RITPGAFRNL GSLRYLSLAN NKLQVLPIGL 120 
FQGLDSLESL LLSSNQLLQI QPAHFSQCSN LKELQLHGNH LEYIPDGAFD HLVGLTKLNL 180 
GKNSLTHISP RVFQHLGNLQ VLRLYENRLT DIPMGTFDGL VNLQELALQQ NQIGLLSPGL 240 
FHNNHNLQRL YLSNNHISQL PPSIFMQLPQ LNRLTLFGNS LKELSLGIFG PMPNLRELWL 300 
YDNHISSLPD NVFSNLRQLQ VLILSRNQIS FISPGAFNGL TELRELSLHT NALQDLDGNV 360 
FRMLANLQNI SLQNNRLRQL PGNIFANVNG LMAIQLQNNQ LENLPLGIFD HLGKLCELRL 420 
YDNPWRCDSD ILPLRNWLLL NQPRLGTDTV PVCFSPANVR GQSLIIINVN VAVPSVHVPE 480 
VPSYPETPWY PDTPSYPDTT SVSSTTELTS PVEDYTDLTT IQVTDDRSVW GMTQAQSGLA 540 
IAAIVIGIVA LACSLAACVG CCCCKKRSQA VLMQMKAPNE C 



Seq ID NO: 287 DNA sequence 
Nucleic Acid Accession #: NM_002362 
Coding sequence: 1..954 

1 11 21 31 41 51 

I I I I I I 

ATGTCTTCTG AGCAGAAGAG TCAGCACTGC AAGCCTGAGG AAGGCGTTGA GGCCCAAGAA 60 
GAGGCCCTGG GCCTGGTGGG TGCACAGGCT CCTACTACTG AGGAGCAGGA GGCTGCTGTC 120 
TCCTCCTCCT CTCCTCTGGT CCCTGGCACC CTGGAGGAAG TGCCTGCTGC TGAGTCAGCA 180 
GGTCCTCCCC AGAGTCCTCA GGGAGCCTCT GCCTTACCCA CTACCATCAG CTTCACTTGC 240 
TGGAGGCAAC CCAATGAGGG TTCCAGCAGC CAAGAAGAGG AGGGGCCAAG CACCTCGCCT 300 
GACGCAGAGT CCTTGTTCCG AGAAGCACTC AGTAACAAGG TGGATGAGTT GGCTCATTTT 360 
CTGCTCCGCA AGTATCGAGC CAAGGAGCTG GTCACAAAGG CAGAAATGCT GGAGAGAGTC 420 
ATCAAAAATT ACAAGCGCTG CTTTCCTGTG ATCTTCGGCA AAGCCTCCGA GTCCCTGAAG 480 
ATGATCTTTG GCATTGACGT GAAGGAAGTG GACCCCGCCA GCAACACCTA CACCCTTGTC 540 
ACCTGCCTGG GCCTTTCCTA TGATGGCCTG CTGGGTAATA ATCAGATCTT TCCCAAGACA 600 
GGCCTTCTGA TAATCGTCCT GGGCACAATT GCAATGGAGG GCGACAGCGC CTCTGAGGAG 660 
GAAATCTGGG AGGAGCTGGG TGTGATGGGG GTGTATGATG GGAGGGAGCA CACTGTCTAT 720 
GGGGAGCCCA GGAAACTGCT CACCCAAGAT TGGGTGCAGG AAAACTACCT GGAGTACCGG 780 
CAGGTACCCG GCAGTAATCC TGCGCGCTAT GAGTTCCTGT GGGGTCCAAG GGCTCTGGCT 840 
GAAACCAGCT ATGTGAAAGT CCTGGAGCAT GTGGTCAGGG TCAATGCAAG AGTTCGCATT 900 
GCCTACCCAT CCCTGCGTGA AGCAGCTTTG TTAGAGGAGG AAGAGGGAGT CTGA 



Seq ID NO: 288 Protein sequence: 
Protein Accession #: NP_002353.1 

1 11 21 31 41 51 

I I I I I I 

MSSEQKSQHC KPEEGVEAQE EALGLVGAQA PTTEEQEAAV SSSSPLVPGT LEEVPAAESA 60 
GPPQSPQGAS ALPTTISFTC WRQPNEGSSS QEEEGPSTSP DAESLFREAL SNKVDELAHF 120 
LLRKYRAKEL VTKAEMLERV IKNYKRCFPV IFGKASESLK MIFGIDVKEV DPASNTYTLV 180 
TCLGLSYDGL LGNNQIFPKT GLLIIVLGTI AMEGDSASEE EIWEELGVMG VYDGREHTVY 240 
GEPRKLLTQD WVQENYLEYR QVPGSNPARY EFLWGPRALA ETSYVKVLEH VVRVNARVRI 300 
AYPSLREAAL LEEEEGV 



Seq ID NO: 289 DNA sequence 
Nucleic Acid Accession #: NM_002362 
Coding sequence: 46..1344 

1 11 21 31 41 51 

I I I I I I 

CGGCGGCCGC GCCCTGGTTG GGTCCCCACT GCTCTCGGGG GCGCCATGGA CGAGGCCGTG 60 

GGCGACCTGA AGCAGGCGCT TCCCTGTGTG GCCGAGTCGC CAACGGTCCA CGTGGAGGTG 120 

CATCAGCGCG GCAGCAGCAC TGCAAAGAAA GAAGACATAA ACCTGAGTGT TAGAAAGCTA 180 

CTCAACAGAC ATAATATTGT GTTTGGTGAT TACACATGGA CTGAGTTTGA TGAACCTTTT 240 

TTGACCAGAA ATGTGCAGTC TGTGTCTATT ATTGACACAG AATTAAAGGT TAAAGACTCA 300 

CAGCCCATCG ATTTGAGTGC ATGCACTGTT GCACTTCACA TTTTCCAGCT GAATGAAGAT 360 

GGCCCCAGCA GTGAAAATCT GGAGGAAGAG ACAGAAAACA TAATTGCAGC AAATCACTGG 420 

GTTCTACCTG CAGCTGAATT CCATGGGCTT TGGGACAGCT TGGTATACGA TGTGGAAGTC 480 

AAATCCCATC TCCTCGATTA TGTGATGACA ACTTTACTGT TTTCAGACAA GAACGTCAAC 540 

AGCAACCTCA TCACCTGGAA CCGGGTGGTG CTGCTCCACG GTCCTCCTGG CACTGGAAAA 600 

ACATCCCTGT GTAAAGCGTT AGCCCAGAAA TTGACAATTA GACTTTCAAG CAGGTACCGA 660 

TATGGCCAAT TAATTGAAAT AAACAGCCAC AGCCTCTTTT CTAAGTGGTT TTCGGAAAGT 720 

GGCAAGCTGG TAACCAAGAT GTTTCAGAAG ATTCAGGATT TGATTGATGA TAAAGACGCC 780 

CTGGTGTTCG TGCTGATTGA TGAGGTGGAG AGTCTCACAG CCGCCCGAAA TGCCTGCAGG 840 

GCGGGCACCG AGCCATCAGA TGCCATCCGC GTGGTCAATG CTGTCTTGAC CCAAATTGAT 900 

CAGATTAAAA GGCATTCCAA TGTTGTGATT CTGACCACTT CTAACATCAC CGAGAAGATC 960 

GACGTGGCCT TCGTGGACAG GGCTGACATC AAGCAGTACA TTGGGCCACC CTCTGCAGCA 1020 

GCCATCTTCA AAATCTACCT CTCTTGTTTG GAAGAACTGA TGAAGTGTCA GATCATATAC 1080 

CCTCGCCAGC AGCTGCTGAC CCTCCGAGAG CTAGAGATGA TTGGCTTCAT TGAAAACAAC 1140 

GTGTCAAAAT TGAGCCTTCT TTTGAATGAC ATTTCAAGGA AGAGCGAGGG CCTCAGCGGC 1200 

CGGGTCCTGA GAAAACTCCC CTTTCTGGCT CATGCGCTGT ATGTCCAGGC CCCCACCGTC 1260 

ACCATAGAGG GGTTCCTCCA GGCCCTGTCT CTGGCAGTGG ACAAGCAGTT TGAAGAGAGA 1320 

AAGAAGCTTG CAGCTTACAT CTGATCCTGG GCTTCCCCAT CTGGTGCTTT TCCCATGGAG 1380 

AACACACAAC CAGTAAGTGA GGTTGCCCCA CACAGCCGTC TCCCAGGGAA TCCCTTCTGC 1440 

AAACCAAACG TTACTTAGAC TGCAAGCTAG AAAGCCACCA AGGCCAGGCT TTGTTAAAAG 1500 

AAGTGTATTC TATTTATGTT GTTTTAAAAT GCATACTGAG AGACAAACAT CTTGTCATTT 1560 

TCACTGTTTG TAAAAGATAA TTCAGATTGT TTGTCTCCTT GTGAAGAACC ATCGAAACCT 1620 

GTTTGTTCCC AGCCCACCCC CAGTGGATGG GATGCATAAT GCCAGCAAGT TTTGTTTAAC 1680 

AGCAAAAAAG GAAGATTAAT GCAGGTGTTA TAGAAGCCAG AAGAGAAACT GTGTCACCCT 1740 

AAAGAAGCAT ATAATCATAG CATTAAAAAT GCACACATTA CTCCAGGTGG AAGGTGGCAA 1800 

TTGCTTTCTG ATATCAGCTC GTTTGATTTA GTGCAAAAAT GTTTTCAAGA CTATTTAATG 1860 

GATGTAAAAA AGCCTATTTC TACATTATAC CAACTGAGAA AAAAATGGTC GGTAAAGTGT 1920 

TCTTTCATAA TAAATAATCA AGACATGGTC CCATTTGCAG GAAAAGTGCA GACTCTGAGT 1980 

GTTCCAGGGA AACACATGCT GGACATCCCT TGTAACCCGG TATGGGCGCC CCTGCATTGC 2040 

TGGGATGTTT CTGCCCACGG TTTTGTTTGT GCAATAACGT TATCACATTT CTAATGAGGA 2100 

TTCACATTAA TATAATATAA AATAAATAGG TCAGTTACTG GTCTCTTTCT GCCGAATGTT 2160 
ATGTTTTGCT TTTATCTCAC AGTAAAATAA ATATAATTAA AAA 



Seq ID NO: 290 Protein sequence: 
Protein Accession #: NP_004228 

1 11 21 31 41 51 

I I I I I I 

MDEAVGDLKQ ALPCVAESPT VHVEVHQRGS STAKKEDINL SVRKLLNRHN IVFGDYTWTE 60 
FDEPFLTRNV QSVSIIDTEL KVKDSQPIDL SACTVALHIF QLNEDGPSSE NLEEETENII 120 
AANHWVLPAA EFHGLWDSLV YDVEVKSHLL DYVMTTLLFS DKNVNSNLIT WNRVVLLHGP 180 
PGTGKTSLCK ALAQKLTIRL SSRYRYGQLI EINSHSLFSK WFSESGKLVT KMFQKIQDLI 240 
DDKDALVFVL IDEVESLTAA RNACRAGTEP SDAIRVVNAV LTQIDQIKRH SNVVILTTSN 300 
ITEKIDVAFV DRADIKQYIG PPSAAAIFKI YLSCLEELMK CQIIYPRQQL LTLRELEMIG 360 
FIENNVSKLS LLLNDISRKS EGLSGRVLRK LPFLAHALYV QAPTVTIEGF LQALSLAVDK 420 
QFEERKKLAA YI 



Seq ID NO: 291 DNA sequence 

Nucleic Acid Accession #: NM_002658.1 

Coding sequence: 77-1372 

1 11 21 31 41 51 

I I I I I I 

GTCCCCGCAG CGCCGTCGCG CCCTCCTGCC GCAGGCCACC GAGGCCGCCG CCGTCTAGCG 60 

CCCCGACCTC GCCACCATGA GAGCCCTGCT GGCGCGCCTG CTTCTCTGCG TCCTGGTCGT 120 

GAGCGACTCC AAAGGCAGCA ATGAACTTCA TCAAGTTCCA TCGAACTGTG ACTGTCTAAA 180 

TGGAGGAACA TGTGTGTCCA ACAAGTACTT CTCCAACATT CACTGGTGCA ACTGCCCAAA 240 

GAAATTCGGA GGGCAGCACT GTGAAATAGA TAAGTCAAAA ACCTGCTATG AGGGGAATGG 300 

TCACTTTTAC CGAGGAAAGG CCAGCACTGA CACCATGGGC CGGCCCTGCC TGCCCTGGAA 360 

CTCTGCCACT GTCCTTCAGC AAACGTACCA TGCCCACAGA TCTGATGCTC TTCAGCTGGG 420 

CCTGGGGAAA CATAATTACT GCAGGAACCC AGACAACCGG AGGCGACCCT GGTGCTATGT 480 

GCAGGTGGGC CTAAAGCCGC TTGTCCAAGA GTGCATGGTG CATGACTGCG CAGATGGAAA 540 

AAAGCCCTCC TCTCCTCCAG AAGAATTAAA ATTTCAGTGT GGCCAAAAGA CTCTGAGGCC 600 

CCGCTTTAAG ATTATTGGGG GAGAATTCAC CACCATCGAG AACCAGCCCT GGTTTGCGGC 660 

CATCTACAGG AGGCACCGGG GGGGCTCTGT CACCTACGTG TGTGGAGGCA GCCTCATCAG 720 

CCCTTGCTGG GTGATCAGCG CCACACACTG CTTCATTGAT TACCCAAAGA AGGAGGACTA 780 

CATCGTCTAC CTGGGTCGCT CAAGGCTTAA CTCCAACACG CAAGGGGAGA TGAAGTTTGA 840 

GGTGGAAAAC CTCATCCTAC ACAAGGACTA CAGCGCTGAC ACGCTTGCTC ACCACAACGA 900 

CATTGCCTTG CTGAAGATCC GTTCCAAGGA GGGCAGGTGT GCGCAGCCAT CCCGGACTAT 960 

ACAGACCATC TGCCTGCCCT CGATGTATAA CGATCCCCAG TTTGGCACAA GCTGTGAGAT 1020 

CACTGGCTTT GGAAAAGAGA ATTCTACCGA CTATCTCTAT CCGGAGCAGC TGAAAATGAC 1080 

TGTTGTGAAG CTGATTTCCC ACCGGGAGTG TCAGCAGCCC CACTACTACG GCTCTGAAGT 1140 

CACCACCAAA ATGCTATGTG CTGCTGACCC CCAATGGAAA ACAGATTCCT GCCAGGGAGA 1200 

CTCAGGGGGA CCCCTCGTCT GTTCCCTCCA AGGCCGCATG ACTTTGACTG GAATTGTGAG 1260 

CTGGGGCCGT GGATGTGCCC TGAAGGACAA GCCAGGCGTC TACACGAGAG TCTCACACTT 1320 

CTTACCCTGG ATCCGCAGTC ACACCAAGGA AGAGAATGGC CTGGCCCTCT GAGGGTCCCC 1380 

AGGGAGGAAA CGGGCACCAC CCGCTTTCTT GCTGGTTGTC ATTTTTGCAG TAGAGTCATC 1440 

TCCATCAGCT GTAAGAAGAG ACTGGGAAGA TAGGCTCTGC ACAGATGGAT TTGCCTGTGG 1500 

CACCACCAGG GTGAACGACA ATAGCTTTAC CCTCACGGAT AGGCCTGGGT GCTGGCTGCC 1560 

CAGACCCTCT GGCCAGGATG GAGGGGTGGT CCTGACTCAA CATGTTACTG ACCAGCAACT 1620 

TGTCTTTTTC TGGACTGAAG CCTGCAGGAG TTAAAAAGGG CAGGGCATCT CCTGTGCATG 1680 

GGCTCGAAGG GAGAGCCAGC TCCCCCGACC GGTGGGCATT TGTGAGGCCC ATGGTTGAGA 1740 

AATGAATAAT TTCCCAATTA GGAAGTGTAA GCAGCTGAGG TCTCTTGAGG GAGCTTAGCC 1800 

AATGTGGGAG CAGCGGTTTG GGGAGCAGAG ACACTAACGA CTTCAGGGCA GGGCTCTGAT 1860 

ATTCCATGAA TGTATCAGGA AATATATATG TGTGTGTATG TTTGCACACT TGTTGTGTGG 1920 

GCTGTGAGTG TAAGTGTGAG TAAGAGCTGG TGTCTGATTG TTAAGTCTAA ATATTTCCTT 1980 

AAACTGTGTG GACTGTGATG CCACACAGAG TGGTCTTTCT GGAGAGGTTA TAGGTCACTC 2040 

CTGGGGCCTC TTGGGTCCCC CACGTGACAG TGCCTGGGAA TGTACTTATT CTGCAGCATG 2100 

ACCTGTGACC AGCACTGTCT CAGTTTCACT TTCACATAGA TGTCCCTTTC TTGGCCAGTT 2160 

ATCCCTTCCT TTTAGCCTAG TTCATCCAAT CCTCACTGGG TGGGGTGAGG ACCACTCCTT 2220 

ACACTGAATA TTTATATTTC ACTATTTTTA TTTATATTTT TGTAATTTTA AATAAAAGTG 2280 
ATCAATAAAA TGTGATTTTT CTGA 



Seq ID NO: 292 Protein sequence: 
Protein Accession #: NP_002649.1 

1 11 21 31 41 51 

I I I I I I 

MRALLARLLL CVLVVSDSKG SNELHQVPSN CDCLNGGTCV SNKYFSNIHW CNCPKKFGGQ 60 
HCEIDKSKTC YEGNGHFYRG KASTDTMGRP CLPWNSATVL QQTYHAHRSD ALQLGLGKHN 120 
YCRNPDNRRR PWCYVQVGLK PLVQECMVHD CADGKKPSSP PEELKFQCGQ KTLRPRFKII 180 
GGEFTTIENQ PWFAAIYRRH RGGSVTYVCG GSLISPCWVI SATHCFIDYP KKEDYIVYLG 240 
RSRLNSNTQG EMKFEVENLI LHKDYSADTL AHHNDIALLK IRSKEGRCAQ PSRTIQTICL 300 
PSMYNDPQFG TSCEITGFGK ENSTDYLYPE QLKMTVVKLI SHRECQQPHY YGSEVTTKML 360 
CAADPQWKTD SCQGDSGGPL VCSLQGRMTL TGIVSWGRGC ALKDKPGVYT RVSHFLPWIR 420 
SHTKEENGLA L 

Seq ID NO: 293 DNA sequence 
Nucleic Acid Accession #: NM_001498 
Coding sequence: 93..2006 



1 11 21 31 41 51 

I I I I I I 

GGCACGAGGC TGAGTGTCCG TCTCGCGCCC GGAAGCGGGC GACCGCCGTC AGCCCGGAGG 60 

AGGAGGAGGA GGAGGAGGAG GAGGGGGCGG CCATGGGGCT GCTGTCCCAG GGCTCGCCGC 120 

TGAGCTGGGA GGAAACCAAG CGCCATGCCG ACCACGTGCG GCGGCACGGG ATCCTCCAGT 180 

TCCTGCACAT CTACCACGCC GTCAAGGACC GGCACAAGGA CGTTCTCAAG TGGGGCGATG 240 

AGGTGGAATA CATGTTGGTA TCTTTTGATC ATGAAAATAA AAAAGTCCGG TTGGTCCTGT 300 

CTGGGGAGAA AGTTCTTGAA ACTCTGCAAG AGAAGGGGGA AAGGACAAAC CCAAACCATC 360 

CTACCCTTTG GAGACCAGAG TATGGGAGTT ACATGATTGA AGGGACACCA GGACAGCCCT 420 

ACGGAGGAAC AATGTCCGAG TTCAATACAG TTGAGGCCAA CATGCGAAAA CGCCGGAAGG 480 

AGGCTACTTC TATATTAGAA GAAAATCAGG CTCTTTGCAC AATAACTTCA TTTCCCAGAT 540 

TAGGCTGTCC TGGGTTCACA CTGCCCGAGG TCAAACCCAA CCCAGTGGAA GGAGGAGCTT 600 

CCAAGTCCCT CTTCTTTCCA GATGAAGCAA TAAACAAGCA CCCTCGCTTC AGTACCTTAA 660 

CAAGAAATAT CCGACATAGG AGAGGAGAAA AGGTTGTCAT CAATGTACCA ATATTTAAGG 720 

ACAAGAATAC ACCATCTCCA TTTATAGAAA CATTTACTGA GGATGATGAA GCTTCAAGGG 780 

CTTCTAAGCC GGATCATATT TACATGGATG CCATGGGATT TGGAATGGGC AATTGCTGTC 840 

TCCAGGTGAC ATTCCAAGCC TGCAGTATAT CTGAGGCCAG ATACCTTTAT GATCAGTTGG 900 

CTACTATCTG TCCAATTGTT ATGGCTTTGA GTGCTGCATC TCCCTTTTAC CGAGGCTATG 960 

TGTCAGACAT TGATTGTCGC TGGGGAGTGA TTTCTGCATC TGTAGATGAT AGAACTCGGG 1020 

AGGAGCGAGG ACTGGAGCCA TTGAAGAACA ATAACTATAG GATCAGTAAA TCCCGATATG 1080 

ACTCAATAGA CAGCTATTTA TCTAAGTGTG GTGAGAAATA TAATGACATC GACTTGACGA 1140 

TAGATAAAGA GATCTACGAA CAGCTGTTGC AGGAAGGCAT TGATCATCTC CTGGCCCAGC 1200 

ATGTTGCTCA TCTCTTTATT AGAGACCCAC TGACACTGTT TGAAGAGAAA ATACACCTGG 1260 

ATGATGCTAA TGAGTCTGAC CATTTTGAGA ATATTCAGTC CACAAATTGG CAGACAATGA 1320 

GATTTAAGCC CCCTCCTCCA AACTCAGACA TTGGATGGAG AGTAGAATTT CGACCCATGG 1380 

AGGTGCAATT AACAGACTTT GAGAACTCTG CCTATGTGGT GTTTGTGGTA CTGCTCACCA 1440 

GAGTGATCCT TTCCTACAAA TTGGATTTTC TCATTCCACT GTCAAAGGTT GATGAGAACA 1500 

TGAAGGTAGC ACAGAAAAGA GATGCTGTCT TGCAGGGAAT GTTTTATTTC AGGAAAGATA 1560 

TTTGCAAAGG TGGCAATGCA GTGGTGGATG GTTGTGGCAA GGCCCAGAAC AGCACGGAGC 1620 

TCGCTGCAGA GGAGTACACC CTCATGAGCA TAGACACCAT CATCAATGGG AAGGAAGGTG 1680 

TGTTTCCTGG ACTGATCCCA ATTCTGAACT CTTACCTTGA AAACATGGAA GTGGATGTGG 1740 

ACACCAGATG TAGTATTCTG AACTACCTAA AGCTAATTAA GAAGAGAGCA TCTGGAGAAC 1800 

TAATGACAGT TGCCAGATGG ATGAGGGAGT TTATCGCAAA CCATCCTGAC TACAAGCAAG 1860 

ACAGTGTCAT AACTGATGAA ATGAATTATA GCCTTATTTT GAAGTGTAAC CAAATTGCAA 1920 

ATGAATTATG TGAATGCCCA GAGTTACTTG GATCAGCATT TAGGAAAGTA AAATATAGTG 1980 

GAAGTAAAAC TGACTCATCC AACTAGACAT TCTACAGAAA GAAAAATGCA TTATTGACGA 2040 

ACTGGCTACA GTACCATGCC TCTCAGCCCG TGTGTATAAT ATGAAGACCA AATGATAGAA 2100 

CTGTACTGTT TTCTGGGCCA GTGAGCCAGA AATTGATTAA GGCTTTCTTT GGTAGGTAAA 2160 

TCTAGAGTTT ATACAGTGTA CATGTACATA GTAAAGTATT TTTGATTAAC AATGTATTTT 2220 

AATAACATAT CTAAAGTCAT CATGAACTGG CTTGTACATT TTTAAATTCT TACTCTGGAG 2280 

CAACCTACTG TCTAAGCAGT TTTGTAAATG TACTGGTAAT TGTACAATAC TTGCATTCCA 2340 

GAGTTAAAAT GTTTACTGTA AATTTTTGTT CTTTTAAAGA CTACCTGGGA CCTGATTTAT 2400 

TGAAATTTTT CTCTTTAAAA ACATTTTCTC TCGTTAATTT TCCTTTGTCA TTTCCTTTGT 2460 

TGTCTACATT AAATCACTTG AATCCATTGA AAGTGCTTCA AGGGTAATCT TGGGTTTCTA 2520 

GCACCTTATC TATGATGTTT CTTTTGCAAT TGGAATAATC ACTTGGTCAC CTTGCCCCAA 2580 
GCTTTCCCCT CTGAATAAAT ACCCATTGAA CTCTGAAAAA AAAAAAAAAA AAAA 



Seq ID NO: 294 Protein sequence: 
Protein Accession #: NP_001489 

1 11 21 31 41 51 

I I I I I I 

MGLLSQGSPL SWEETKRHAD HVRRHGILQF LHIYHAVKDR HKDVLKWGDE VEYMLVSFDH 60 

ENKKVRLVLS GEKVLETLQE KGERTNPNHP TLWRPEYGSY MIEGTPGQPY GGTMSEFNTV 120 

EANMRKRRKE ATSILEENQA LCTITSFPRL GCPGFTLPEV KPNPVEGGAS KSLFFPDEAI 180 

NKHPRFSTLT RNIRHRRGEK VVINVPIFKD KNTPSPFIET FTEDDEASRA SKPDHIYMDA 240 

MGFGMGNCCL QVTFQACSIS EARYLYDQLA TICPIVMALS AASPFYRGYV SDIDCRWGVI 300 

SASVDDRTRE ERGLEPLKNN NYRISKSRYD SIDSYLSKCG EKYNDIDLTI DKEIYEQLLQ 360 

EGIDHLLAQH VAHLFIRDPL TLFEEKIHLD DANESDHFEN IQSTNWQTMR FKPPPPNSDI 420 

GWRVEFRPME VQLTDFENSA YVVFVVLLTR VILSYKLDFL IPLSKVDENM KVAQKRDAVL 480 

QGMFYFRKDI CKGGNAVVDG CGKAQNSTEL AAEEYTLMSI DTIINGKEGV FPGLIPILNS 540 

YLENMEVDVD TRCSILNYLK LIKKRASGEL MTVARWMREF IANHPDYKQD SVITDEMNYS 600 
LILKCNQIAN ELCECPELLG SAFRKVKYSG SKTDSSN 




Seq ID NO: 295 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 247-816 

1 11 21 31 41 51 

I I I I I I 

AGTGTTCGGC TGGGGCAGGC ACGCTGTGGC TGGCTACTTC CCTTCCTCCC ATCCCCCTTG 60 
GGCCAAACGG GATCGGTGCT TCTGGTGAGA CGCCTCCCCA TGCACATCAC TCCCAGGTGC 120 
CCTAGGGGGC ACATTTCCCA CAACTCCCAG AGGGCAGGTT TCTAGAAAGT GCCACCAGTG 180 
GGGAGGCGCC ACAACTTCAC TGCCATTTTG TGAGGTGCCG CGGTCTCTCC TCCAGCAAGG 240 
GAAACAATGA CCGATAAAAC AGAGAAGGTG GCTGTAGATC CTGAAACTGT GTTTAAACGT 300 
CCCAGGGAAT GTGACAGTCC TTCGTATCAG AAAAGGCAGA GGATGGCCCT GTTGGCAAGG 360 
AAACAAGGAG CAGGAGACAG CCTTATTGCA GGCTCTGCCA TGTCCAAAGA AAAGAAGCTT 420 
ATGACAGGAC ATGCTATTCC ACCCAGCCAA TTGGATTCTC AGATTGATGA CTTCACTGGT 480 
TTCAGCAAAG ATAGGATGAT GCAGAAACCT GGTAGCAATG CACCTGTGGG AGGAAACGTT 540 
ACCAGCAGTT TCTCTGGAGA TGACCTAGAA TGCAGAGAAA CAGCCTCCTC TCCCAAAAGC 600 
CAACGAGAAA TTAATGCTGA TATAAAACGT AAATTAGTGA AGGAACTCCG ATGCGTTGGA 660 
CAAAAATATG AAAAAATCTT CGAAATGCTT GAAGGAGTGC AAGGACCTAC TGCAGTCAGG 720 
AAGCGATTTT TTGAATCCAT CATCAAGGAA GCAGCAAGAT GTATGAGACG AGACTTTGTT 780 
AAGCACCTTA AGAAGAAACT GAAACGTATG ATTTGAGAAT ACTTGTCCCT GGAGGATTAT 840 
CACACCCCAA ATGCATAATC TCGTTAATGA TTGAGGAGAG AAAAGGATCA GATTGCTGTT 900 
TTCTACAATG GAGCAGGATA TTGCTGAAGT CTCCTGGCAT ATGTTACCGA ATCAAATAGC 960 
CTTCCAGAGG CTAAGAAATT TCTGTTAGTA AAAGATGTTC TTTTTCCCAA AGCATTTTAT 1020 
TTGAAAGGAT AACTTGTGTT TTGGTTATTT TGTATTCCCA CCTGTGCTGG TAGATATTAT 1080 
TAACCCATTA GGTAAATACT ATTACAGTCG TGGTTTCTGC A 



Seq ID NO: 296 Protein sequence: 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

I I I I I I 

MTDKTEKVAV DPETVFKRPR ECDSPSYQKR QRMALLARKQ GAGDSLIAGS AMSKEKKLMT 60 
GHAIPPSQLD SQIDDFTGFS KDRMMQKPGS NAPVGGNVTS SFSGDDLECR ETASSPKSQR 120 
EINADIKRKL VKELRCVGQK YEKIFEMLEG VQGPTAVRKR FFESIIKEAA RCMRRDFVKH 180 
LKKKLKRMI 

Seq ID NO: 297 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 247-815 

1 11 21 31 41 51 

I I I I I I 

AGTGTTCGGC TGGGGCAGGC ACGCTGTGGC TGGCTACTTC CCTTCCTCCC ATCCCCCTTG 60 
GGCCAAACGG GATCGGTGCT TCTGGTGAGA CGCCTCCCCA TGCACATCAC TCCCAGGTGC 120 
CCTAGGGGGC ACATTTCCCA CAACTCCCAG AGGGCAGGTT TCTAGAAAGT GCCACCAGTG 180 
GGGAGGCGCC ACAACTTCAC TGCCATTTTG TGAGGTGCCG CCGTCTCTCC TCCAGCAAGG 240 
GAAACAATGA CCGATAAAAC AGAGAAGGTG GCTGTAGATC CTGAAACTGT GTTTAAACGT 300 
CCCAGGGAAT GTGACAGTCC TTCGTATCAG AAAAGGCAGA GGATGGCCCT GTTGGCAAGG 360 
AAACAAGGAG CAGGAGACAG CCTTATTGCA GGCTCTGCCA TGTCCAAAGA AAAGAAGCTT 420 
ATGACAGGAC ATGCTATTCC ACCCAGCCAA TTGGATTCTC AGATTGATGA CTTCACTGGT 480 
TTCAGCAAAG ATAGGATGAT GCAGAAACCT GGTAGCAATG CACCTGTGGG AGGAAACGTT 540 
ACCAGCAGTT TCTCTGGAGA TGACCTAGAA TGCAGAGAAA CAGCCTCCTC TCCCAAAAGC 600 
CAACAAGAAA TTAATGCTGA TATAAAACGT AAATTAGTGA AGGAACTCCG ATGCGTTGGA 660 
CAAAAATATG AAAAAATCTT CGAAATGCTT GAAGGAGTGC AAGGACCTAC TGCAGTCAGG 720 
AAACGATTTT TTGAATCCAT CATCAAGGAA GCAGCAAGAT GTATGAGACG AGACTTTGTT 780 
AAGCACCTTA AGAAGAAACT GAAACGTATG ATTTGAGAAT ACTTGTCCCT GGAGGATTAT 840 
CACACCCCAA ATGCATAATC TCATTAATGA TTGAGGAGAG AAAAGGATCA GATTGCTGTT 900 
TTCTACAATG GAGCAGGATA TTGCTGAAGT CTCCTGGCAT ATGTTACCGA ATCAACTGGC 960 
CTTCCAGAGG CTAAGAAATT TCTGTTAGTA AAAGATGTTC TTTTTCCCAA AGCGTTTTAT 1020 
TTGAAAGGAT AACTTGTGTT TTGGTTATTT TGTATTCCCA CCTGTGCTGG TAGATATTAT 1080 
TAACCCATTA GGTAAATACT ATTACAGTCG TGGTTTCTGC A 



Seq ID NO: 298 Protein sequence: 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

I I I I I I 

MTDKTEKVAV DPETVFKRPR ECDSPSYQKR QRMALLARKQ GAGDSLIAGS AMSKEKKLMT 60 
GHAIPPSQLD SQIDDFTGFS KDRMMQKPGS NAPVGGNVTS SFSGDDLECR ETASSPKSQQ 120 
EINADIKRKL VKELRCVGQK YEKIFEMLEG VQGPTAVRKR FFESIIKEAA RCMRRDFVKH 180 
LKKKLKRMI 

Seq ID NO: 299 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 247-815 

1 11 21 31 41 51 

I I I I I I 

AGTGTTCGGC TGGGGCAGGC ACGCTGTGGC TGGCTACTTC CCTTCCTCCC ATCCCCCTTG 60 
GGCCAAACGG GATCGGTGCT TCTGGTGAGA CGCCTCCCCA TGCACATCAC TCCCAGGTGC 120 
CCTAGGGGGC ACATTTCCCA CAACTCCCAG AGGGCAGGTT TCTAGAAAGT GCCACCAGTG 180 
GGGAGGCGCC ACAACTTCAC TGCCATTTTG TGAGGTGCCG CCGTCTCTCC TCCAGCAAGG 240 
GAAACAATGA CCGATAAAAC AGAGAAGGTG GCTGTAGATC CTGAAACTGT GTTTAAACGT 300 
CCCAGGGAAT GTGACAGTCC TTCGTATCAG AAAAGGCAGA GGATGGCCCT GTTGGCAAGG 360 
AAACAAGGAG CAGGAGACAG CCTTATTGCA GGCTCTGCCA TGTCCAAAGC AAAGAGCTTA 420 
TGACAGGACA TGCTATTCCA CCCAGCCAAT TGGATTCTCA GATTGATGAC TTCACTGGTT 480 
TCAGCAAAGA TAGGATGATG CAGAAACCTG GTAGCAATGC ACCTGTGGGA GGAAACGTTA 540 
CCAGCAGTTT CTCTGGAGAT GACCTAGAAT GCAGAGAAAC AGCCTCCTCT CCCAAAAGCC 600 
AACAAGAAAT TAATGCTGAT ATAAAACGTA AATTAGTGAA GGAACTCCGA TGCGTTGGAC 660 
AAAAATATGA AAAAATCTTC GAAATGCTTG AAGGAGTGCA AGGACCTACT GCAGTCAGGA 720 
AACGATTTTT TGAATCCATC ATCAAGGAAG CAGCAAGATG TATGAGACGA GACTTTGTTA 780 
AGCACCTTAA GAAGAAACTG AAACGTATGA TTTGAGAATA CTTGTCCCTG GAGGATTATC 840 
ACACCCCAAA TGCATAATCT CATTAATGAT TGAGGAGAGA AAAGGATCAG ATTGCTGTTT 900 
TCTACAATGG AGCAGGATAT TGCTGAAGTC TCCTGGCATA TGTTACCGAA TCAACTGGCC 960 
TTCCAGAGGC TAAGAAATTT CTGTTAGTAA AAGATGTTCT TTTTCCCAAA GCGTTTTATT 1020 
TGAAAGGATA ACTTGTGTTT TGGTTATTTT GTATTCCCAC CTGTGCTGGT AGATATTATT 1080 
AACCCATTAG GTAAATACTA TTACAGTCGT GGTTTCTGCA 

Seq ID NO: 300 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51
| | | | | |

MTDKTEKVAV DPETVFKRPR ECDSPSYQKR QRMALLARKQ GAGDSLIAGS AMSKAKKLMT 
GHAIPPSQLD SQIDDFTGFS KDRMMQKPGS NAPVGGNVTS SFSGDDLECR ETASSPKSQQ 
EINADIKRKL VKELRCVGQK YEKIFEMLEG VQGPTAVRKR FFESIIKEAA RCMRRDFVKH 
LKKKLKRMI 

Seq ID NO: 301 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 247-812

1 11 21 31 41 51
| | | | | |

AGTGTTCGGC TGGGGCAGGC ACGCTGTGGC TGGCTACTTC CCTTCCTCCC ATCCCCCTTG 60
GGCCAAACGG GATCGGTGCT TCTGGTGAGA CGCCTCCCCA TGCACATCAC TCCCAGGTGC 120
CCTAGGGGGC ACATTTCCCA CAACTCCCAG AGGGCAGGTT TCTAGAAAGT GCCACCAGTG 180
GGGAGGCGCC ACAACTTCAC TGCCATTTTG TGAGGTGCCG CCGTCTCTCC TCCAGCAAGG 240
GAAACAATGA CCGATAAAAC AGAGAAGGTG GCTGTAGATC CTGAAACTGT GTTTAAACGT 300
CCCAGGGAAT GTGACAGTCC TTCGTATCAG AAAAGGCAGA GGATGGCCCT GTTGGCAAGG 360
AAACAAGGAG CAGGAGACAG CCTTATTGCA GGCTCTGCCA TGTCCAAAGA AAAGAGCTTA 420
TGACAGGACA TGCTATTCCA CCCAGCCAAT TGGATTCTCA GATTGATGAC TTCACTGGTT 480
TCAGCAAAGA TGGGATGATG CAGAAACCTG GTAGCAATGC ACCTGTGGGA GGAAATGTTA 540
CCAGCAATTT CTCTGGAGAT GACCTAGAAT GCAGAGGAAT AGCCTCCTCT CCCAAAAGCC 600
AACAAGAAAT TAATGCTGAT ATAAAATGTC AAGTAGTGAA GGAAATCCGA TGCCTTGGAC 660
AATATGAAAA AATCTTCGAA ATGCTTGAAG GAGTGCAAGG ACCTACTGCA GTCAGGAAAC 720
GATTTTTTGA ATCCATCATC AAGGAAGCAG CAAGATGTAT GAGACGAGAC TTTGTTAAGC 780
ACCTTAAGAA GAAACTGAAA CGTATGATTT GAGAATACTT GTCCCTGGAG GATTATCACA 840
CCCCAAATGC ATAATCTCAT TAATGATTGA GGAGAGAAAA GGATCAGATT GCTGTTTTCT 900
ACAATGGAGC AGGATATTGC TGAAGTCTCC TGGCATATGT TACCGAATCA ACTGGCCTTC 960
CAGAGGCTAA GAAATTTCTG TTAGTAAAAG ATGTTCTTTT TCCCAAAGCG TTTTATTTGA 1020
AAGGATAACT TGTGTTTTGG TTATTTTGTA TTCCCACCTG TGCTGGTAGA TATTATTAAC 1080
CCATTAGGTA AATACTATTA CAGTCGTGGT TTCTGCA


Seq ID NO: 302 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51
| | | | | |

MTDKTEKVAV DPETVFKRPR ECDSPSYQKR QRMALLARKQ GAGDSLIAGS AMSKEKKLMT 
GHAIPPSQLD SQIDDFTGFS KDGMMQKPGS NAPVGGNVTS NFSGDDLECR GIASSPKSQQ 
EINADIKCQV VKEIRCLGQY EKIFEMLEGV QGPTAVRKRF FESIIKEAAR CMRRDFVKHL 
KKKLKRMI 

Seq ID NO: 303 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 247-815

1 11 21 31 41 51
| | | | | |

AGTGTTCGGC TGGGACAGGC ACGCTGTGGC TGGCTACTTC CCTTCCTTCC ATCCCCCTTG 60
GGCCAAACAG GATCGGTGCT TCTGGTGAGA CGTCTCCCCA TGCACATCAC TCCCAGATGC 120
CCTAGGGGGC ACATTTCCCA CAACTCCCAG AGGGCAGGTT TCTAGAAAGT GCCACCAGTG 180
GGGAGGCGCC ACAACTTCAC TGCCATTTTG TGAGGTGCCG CCGTCTCTCC TCCAGCAAGG 240
GAAACAATGA CCGATAAAAC AGAGAAGGTG GCTGTAGATC CTGAAACTGT GTTTAAACGT 300
CCCAGGGAAT GTGACAGTCC TTCGTATCAG AAAAGGCAGA GGATGGCCCT GTTGGCAAGG 360
AAACAAGGAG CAGGAGACAG CCTTATTGCA GGCTCTGCCA TGTCCAAAGC AAAGAGCTTA 420
TGACAGGACA TGCTATTCCA CCCAGCCAAT TGGATTCTCA GATTGATGAC TTCACTGGTT 480
TCAGCAAAGA TAGGATGATG CAGAAACCTG GTAGCAATGC ACCTGTGGGA GGAAACGTTA 540
CCAGCAGTTT CTCTGGAGAT GACCTAGAAT GCAGAGAAAC AGCCTCCTCT CCCAAAAGCC 600
AACAAGAAAT TAATGCTGAT ATAAAACGTA AATTAGTGAA GGAACTCCGA TGCGTTGGAC 660
AAAAATATGA AAAAATCTTC GAAATGCTTG AAGGAGTGCA AGGACCTACT GCAGTCAGGA 720
AACGATTTTT TGAATCCATC ATCAAGGAAG CAGCAAGATG TATGAGACGA GACTTTGTTA 780
AGCACCTTAA GAAGAAACTG AAACGTATGA TTTGAGAATA CTTGTCCCTG GAGGATTATC 840
ACACCCCAAA TGCATAATCT CGTTAATGAT TGAGGAGAGA AAAGGATCAG ATTGCTGTTT 900
TCTACAATGG AGCAGGATAT TGCTGAAGTC TCCTGGCATA TGTTACCGAA TCAACTGGCC 960
TTCCAGAGGC TAAGAAATTT CTGTTAGTAA AAGATGTTCT TTTTCCCAAA GCGTTTTATT 1020
TGAAAGGATA ACTTGTGTTT TGGTTATTTT GTATTCCCAC CTGTGCTGGT AGATATTATT 1080
AACCCATTAG GTAAATACTA TTACAGTCGT GGTTTCTGCA

Seq ID NO: 304 Protein sequence:
Protein Accession #: Eos sequence

1 11 21 31 41 51
| | | | | |

MTDKTEKVAV DPETVFKRPR ECDSPSYQKR QRMALLARKQ GAGDSLIAGS AMSKAKKLMT 60
GHAIPPSQLD SQIDDFTGFS KDRMMQKPGS NAPVGGNVTS SFSGDDLECR ETASSPKSQQ 120
EINADIKRKL VKELRCVGQK YEKIFEMLEG VQGPTAVRKR FFESIIKEAA RCMRRDFVKH 180
LKKKLKRMI

Seq ID NO: 305 DNA sequence
Nucleic Acid Accession #: Eos sequence
Coding sequence: 87-589

1 11 21 31 41 51

| | | | | |

CGTGGAGGCA GCTAGCGCGA GGCTGGGGAG CGCTGAGCCG CGCGTCGTGC CCTGCGCTGC 60 

CCAGACTAGC GAACAATACA GTCAGGATGG CTAAAGGTGA CCCCAAGAAA CCAAAGGGCA 120 

AGATGTCCGC TTATGCCTTC TTTGTGCAGA CATGCAGAGA AGAACATAAG AAGAAAAACC 180 

CAGAGGTCCC TGTCAATTTT GCGGAATTTT CCAAGAAGTG CTCTGAGAGG TGGAAGACGA 240 

TGTCCGGGAA AGAGAAATCT AAATTTGATG AAATGGCAAA GGCAGATAAA GTGCGCTATG 300 

ATCGGGAAAT GAAGGATTAT GGACCAGCTA AGGGAGGCAA GAAGAAGAAG GATCCTAATG 360 

CTCCCAAAAG GCCACCGTCT GGATTCTTCC TGTTCTGTTC AGAATTCCGC CCCAAGATCA 420 

AATCCACAAA CCCCGGCATC TCTATTGGAG ACGTGGCAAA AAAGCTGGGT GAGATGTGGA 480 

ATAATTTAAA TGACAGTGAA AAGCAGCCTT ACATCACTAA GGCGGCAAAG CTGAAGGAGA 540 

AGTATGAGAA GGATGTTGCT GACTATAAGT CGAAAGGAAA GTTTGATGGT GCAAAGGGTC 600 

CTGCTAAAGT TGCCCGGAAA AAGGTGGAAG AGGAAGATGA AGAAGAGGAG GAGGAAGAAG 660 

AGGAGGAGGA GGAGGAGGAG GATGAATAAA GAAACTGTTT ATCTGTCTCC TTGTGAATAC 720 

TTAGAGTAGG GGAGCGCCGT AATTGACACA TCTCTTATTT GAGAAGTGTC TGTTGCCCTC 780 

ATTAGGTTTA ATTACAAAAT TTGATCACGA TCATATTGTA GTCTCTCAAA GTGCTCTAGA 840 

AATTGTCAGT GGTTTACATG AAGTGGCCAT GGGTGTCTGG AGCACCCTGA AACTGTATCA 900 

AAGTTGTACA TATTTCCAAA CATTTTTAAA ATGAAAAGGC ACTCTCGTGT TCTCCTCACT 960 

CTGTGCACTT TGCTGTTGGT GTGACAAGGC ATTTAAAGAT GTTTCTGGCA TTTTCTTTTT 1020 

ATTTGTAAGG TGGTGGTAAC TATGGTTATT GGCTAGAAAT CCTGAGTTTT CAACTGTATA 1080 

TATCTATAGT TTGTAAAAAG AACAAAACAA CCGAGACAAA CCCTTGATGC TCCTTGCTCG 1140 

GCGTTGAGGC TGTGGGGAAG ATGCCTTTTG GGAGAGGCTG TAGCTCAGGG CGTGCACTGT 1200 

GAGGCTGGAC CTGTTGACTC TGCAGGGGGC ATCCATTTAG CTTCAGGTTG TCTTGTTTCT 1260 

GTATATAGTG ACATAGCATT CTGCTGCCAT CTTAGCTGTG GACAAAGGGG GGTCAGCTGG 1320 

CATGAGAATA TTTTTTTTTT TAAGTGOGGT AGTTTTTAAA CTGTTTGTTT TTAAACAAAC 1380

TATAGAACTC TTCATTGTCA GCAAAGCAAA GAGTCACTGC ATCAATGAAA GTTCAAGAAC 1440 

CTCCTGTACT TAAACACGAT TCGCAACGTT CTGTTATTTT TTTTGTATGT TTAGAATGCT 1500 

GAAATGTTTT TGAAGTTAAA TAAACAGTAT TACATTTTTA AAACTCTTCT CTATTATAAC 1560 

AGTCAATTTC TGACTCACAG CAGTGAACAA ACCCCCACTC CATTGTATTT GGAGACTGGC 1620 

CTCCCTATAA ATGTGGTAGC TTCTTTTATT ACTCAGTGGC CAGCTCACTT AGGGCTGAGA 1680 

TGAAGGAGAG GGCTACTTGA AGCTACTGTG TGATTTTGTT TGTGTCTGAG TGGCATTCAG 1740 

ATGAAGTCTG GAGGAGTTAG GAGAACGACA TAGGCAAGGT TCAGCAGCCT TCCAAGGTAT 1800 

AGGAAGGTGG GTGATTAGGA CTGAGGCTAT CTAGGTTTAA CTTTTGTCCC ACCTCCACCC 1860 

CCTATTTTGT GGGGCCAAAT GCATTGCTAA ACAGCAATTT CAGAGTGTAT GGTGTGTCAA 1920 

AAATTAAGGC CTTATTGTTT TTCTCTTTCA CCCCTACCCC CCGTGCTCCT GGCACATATC 1980 

ACATTATTTG TGGTGCCCAA CATTTGGGGT CTTGAGCCTG CTGCTGGTCT CCTGGATGCC 2040 

AGTGAGGGTA TGTGGGATGG GGTGGTGGGG TAGGGGACGG TATCCTTTTT TTGCTCCTAC 2100 

TTGGAAACAC CAAACACCCC AAGGAAGATG ATAGGCTCCA TCTTGGGCCA CCTGAGCTAT 2160 

AGGGCAGGCT AATGGAATCA ACCATTTCTG AGCACTAAAT GTATCATGAA AAGTTGAATG 2220 

GCCTGCTCAT AAGTTTAGCT CATTCACTGG AAATGTAGAT TGATGTTCAA TGTTAAACTG 2280 

GAAGGAGCTT GGTTTGTGTG TCAGTGGTTA TATTAGTGGG TAGTGTAACA TTTTATCCAG 2340 

GTTGGGGTGA GGGGAGATGG CCACAGTAGC AAGTGGTGAC ACTAAATACC ATTTTGAAGG 2400

CTGATGTGTA TATACATCAT TACTGTCCGT AGCAATGAAG GATACAGTAC TGTGTTGTGG 2460 

GTGAGTGTTG CTATTGCCCA GCATTAATAT TTGGGTGTGT ATGTTTGAGG CTATGAAACA 2520 

CGCAGGAGTG TTTTTGTGCT ATTAATTTTA AGAGAAAGCA GCTTTTTCTT AAAATTCACT 2580 

GTTGAGAAAC TTGCATGTCT GGAGGCGGTG TCCTCTCCGC CCTGTCGGGT CCTGGATGAG 2640 

TACGAGTTAT GGTCACGGTC ACAGCCTGAT CTCTTATGTG TTCATAGCCA TTCGCTCTCC 2700 

CATCAGAACT GTTTGTCCTG AATGTGTTCC TCTAGTTCTA GAAAATGACC ACTAATTTAA 2760 

AAAACTCGGT TGTGAGGTTT GCCCAGAGGC ACTTGTTCCA GAATTTCCCC TCCTGCTTCA 2820 

GCCATGTCCT TGTCACTTGG CATTCTAAGC TAAAGCTTTA GCTTCCCAAT TCGTGATGTG 2880 

CTAGGCCAAG ATTCGGGAGC TGTTGCCAGC CTCGTCAAAT ATGGAAGAGA AACAACCTGC 2940 

GGTCAAAAGG GAGTGATTTG TTAAGTGGTG CGCGTCTATC TCATAACTAG ATGTACCAAC 3000 

CAGGGAAGGG CCAAGGATGG AAAGGGGTAA CTTTTGTGCT TCCAAAGTAG CTAAGCAGAA 3060 

GTGGGGGAGC AGTTTAGCCA GATGATCTTT GATTAGGCAA ACATTGAGTT TTAAAGAGGC 3120 

TGTCAAGTTG AGGCCACTTG GTCCATTAGC TGGGGCAGCA AGATCACTAC TCAACGTTTT 3180 

CACACTGTGG CAAGATTGCT CTTCTAGTGG AATAATGCCC TAGTTTCTCT GAGATGATGT 3240 

AAGTGGCATG ATGTTACCTA AGGCTTAGGC TTAGCTTGAT TTCTGGGCCC ACTGTCTGTG 3300 

TTCTTAAGAT GCCAACCTGT TGCTTTTTTT TTTTTTTTCC CCCATTTAAA AGGATAGTAC 3360 

CTACTCCCTC TAACCACCTC ACCCCATTCT TGAATGACAT TTTATCCTTC GGAAAGAACA 3420 

AGGCTGTGAT GTAGTGACTA TTGTCTGTGT CTCCTGTGTG TGTCTGTTCT TGTCACAAAT 3480 
GTATTTGGGG ACGTTGGATG CATTCATTTT CTGTAATAAA G 

Seq ID NO: 306 Protein sequence:
Protein Accession #: NP_005333.1

1 11 21 31 41 51

| | | | | |

MAKGDPKKPK GKMSAYAFFV QTCREEHKKK NPEVPVNFAE FSKKCSERWK TMSGKEKSKF 60

DEMAKADKVR YDREMKDYGP AKGGKKKKDP NAPKRPPSGF FLFCSEFRPK IKSTNPGISI 120

GDVAKKLGEM WNNLNDSEKQ PYITKAAKLK EKYEKDVADY KSKGKFDGAK GPAKVARKKV 180
EEEDEEEEEE EEEEEEEEDE
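
The "Coding sequence" coordinates given with each DNA entry are 1-based and inclusive, so a listed protein can be checked against its DNA entry by translating from the stated start position. The following is a minimal sketch in Python under stated assumptions: it uses the standard genetic code, and the helper names `clean` and `translate_cds` are illustrative and do not come from this listing. The example input is the first two rows of Seq ID NO: 305 with the start coordinate 87 given for that entry. Any character other than A, C, G or T (including scanning damage) is simply dropped, which would shift the reading frame, so a mismatch against the listed protein may reflect a damaged row rather than a real difference.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def clean(raw: str) -> str:
    """Keep only A/C/G/T, dropping spaces, line breaks and position numbers."""
    return "".join(ch for ch in raw.upper() if ch in "ACGT")


def translate_cds(seq: str, start: int, end: int) -> str:
    """Translate seq[start..end] (1-based, inclusive), stopping at a stop codon."""
    cds = seq[start - 1:end]
    residues = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":  # stop codon ends the protein
            break
        residues.append(aa)
    return "".join(residues)


# First two rows of Seq ID NO: 305 as printed, including the row-end numbers.
SEQ_305_ROWS = (
    "CGTGGAGGCA GCTAGCGCGA GGCTGGGGAG CGCTGAGCCG CGCGTCGTGC CCTGCGCTGC 60\n"
    "CCAGACTAGC GAACAATACA GTCAGGATGG CTAAAGGTGA CCCCAAGAAA CCAAAGGGCA 120\n"
)

if __name__ == "__main__":
    # Only 120 bases are supplied here, so translation is limited to 87..120.
    print(translate_cds(clean(SEQ_305_ROWS), 87, 120))
```

Run on these two rows, the sketch yields MAKGDPKKPKG, which agrees with the first eleven residues listed for Seq ID NO: 306.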

Seq ID NO: 307 DNA sequence
Nucleic Acid Accession #: NM_022342
Coding sequence: 1..2178



41 51 
I I 

GAGDSLIAGS AMSKAKKLMT 60 
SFSGDDfcECR ETASSPKSQQ 120 
FFESIIKEAA RCMRRDFVKH 180 






1 11 21 31 41 51

| | | | | |

ATGGGTACTA GGAAAAAAGT TCATGCATTT GTCCGTGTCA AACCCACCGA TGACTTTGCT 60 

CATGAAATGA TCAGATACGG AGATGACAAA AGAAGCATTG ATATTCACTT AAAAAAAGAC 120

ATTCGGAGAG GAGTTGTCAA TAACCAACAG ACAGACTGGT CGTTTAAGTT GGATGGAGTT 180 

TTCACGATG CCTCCCAGGA CTTGGTTTAT GAGACAGTTG CAAAGGATGT GGTTTCTCAG 240 

CCCTCGATG GCTATAATGG CACCATCATG TGTTATGGGC AGACGGGAGC TGGCAAGACA 300 

ACACCATGA TGGGGGCAAC TGAGAATTAC AAGCACCGGG GGATCCTCCC TCGTGCCCTG 360 

AGCAGGTTT TTAGGATGAT CGAAGAACGC CCCACACATG CCATCACTGT GCGTGTTTCC 420

ACTTGGAAA TCTATAATGA GAGCCTGTTT GATCTCCTGT CCACTCTGCC CTATGTTGGA 480

CCTCAGTCA CACCAATGAC CATCGTGGAA AACCCTCAAG GAGTCTTCAT TAAGGGCTTG 540 

CAGTTCACC TCACAAGTCA GGAGGAGGAT GCATTCAGCC TCCTTTTTGA GGGTGAGACC 600

ACAGGATTA TAGCCTCCCA CACTATGAAC AAAAACTCTT CCAGATCACA CTGCATTTTC 660 

CCATCTACT TAGAGGCCCA TTCCCGGACC TTATCAGAGG AAAAGTACAT CACTTCCAAA 720 

TTAACTTGG TGGATCTGGC AGGCTCAGAG AGGCTGGGGA AGTCTGGGTC TGAGGGCCAA 780 

TCCTGAAGG AAGCCACCTA CATCAACAAA TCGCTCTCAT TCCTGGAGCA GGCCATCATT 840

CCCTTGGGG ACCAGAAGCG GGACCACATC CCCTTTCGGC AGTGCAAGCT CACCCACGCT 900 

TGAAGGACT CGTTAGGGGG AAACTGCAAT ATGGTCCTCG TGACAAACAT CTATGGAGAA 960 

CTGCCCAGT TAGAAGAAAC GCTATCTTCA CTGAGATTTG CCAGCAGGAT GAAGCTAGTC 1020 

CCACTGAGC CTGCCATCAA TGAAAAGTAT GATGCTGAGA GAATGGTCAA GAACCTGGAG 1080 

AGGAACTAG CACTACTCAA GCAGGAGCTG GCTATCCATG ACAGCCTGAC CAACCGCACC 1140 

TTGTGACCT ATGACCCCAT GGATGAAATC CAGATTGCTG AGATCAACTC CCAGGTGCGG 1200 

GGTACCTGG AGGGGACACT GGACGAGATC GACATAATCA GCCTTAGACA GATCAAGGAG 1260 

TGTTCAACC AGTTCCGGGT GGTTCTGAGC CAACAGGAAC AGGAAGTGGA GTCCACTTTG 1320 

GCAGGAAGT ACACCCTCAT TGACAGGAAT GACTTTGCAG CCATTTCTGC TATCCAGAAG 1380 

CGGGGCTTG TGGATGTTGA TGGCCACCTA GTGGGTGAGC CTGAAGGACA AAACTTTGGA 1440 

TCGGAGTCG CCCCTTTCTC TACCAAACCT GGGAAGAAAG CCAAGTCCAA GAAGACATTC 1500 

AAGAGCCAC TCAGGCCCGA CACCCCACCC TCCAAACCAG TGGCCTTTGA GGAGTTTAAG 1560 

ATGAGCAAG GTAGTGAGAT CAACCGAATT TTCAAAGAAA ACAAATCCAT CTTGAATGAA 1620 

GGAGGAAAA GGGCCAGCGA GACCACACAG CACATCAATG CCATCAAGCG GGAGATTGAT 1680 

TGACCAAGG AGGCCCTGAA TTTCCAGAAG TCACTACGGG AGAAGCAAGG CAAGTACGAA 1740 

ACAAGGGGC TGATGATCAT CGATGAGGAA GAATTCCTGC TGATCCTCAA GCTCAAAGAC 1800 

TCAAGAAGC AGTACCGCAG CGAGTACCAG GACCTGCGTG ACCTCAGGGC TGAGATCCAG 1860 

ATTGCCAGC ACCTAGTGGA TCAGTGTCGC CACCGCCTGC TCATGGAATT TGACATCTGG 1920 

ACAATGAGT CCTTTGTCAT CCCTGAGGAC ATGCAGATGG CACTGAAGCC AGGCGGCAGC 1980 

TCCGGCCAG GCATGGTCCC TGTGAACAGG ATTGTGTCTC TGGGAGAAGA TGACCAGGAC 2040 

AATTCAGCC AGCTGCAGCA GAGGGTGCTT CCTGAGGGCC CTGATTCCAT CTCCTTCTAC 2100 

ATGCCAAAG TCAAGATAGA GCAGAAGCAT AATTACTTGA AAACCATGAT GGGCCTCCAG 2160 
AGGCACATA GAAAATAG 



Seq ID NO: 308 Protein sequence:
Protein Accession #: NP_071737

1 11 21 31 41 51

| | | | | |

MGTRKKVHAF VRVKPTDDFA HEMIRYGDDK RSIDIHLKKD IRRGVVNNQQ TDWSFKLDGV 60
LHDASQDLVY ETVAKDVVSQ ALDGYNGTIM CYGQTGAGKT YTMMGATENY KHRGILPRAL 120
QQVFRMIEER PTHAITVRVS YLEIYNESLF DLLSTLPYVG PSVTPMTIVE NPQGVFIKGL 180
SVHLTSQEED AFSLLFEGET NRIIASHTMN KNSSRSHCIF TIYLEAHSRT LSEEKYITSK 240
INLVDLAGSE RLGKSGSEGQ VLKEATYINK SLSFLEQAII ALGDQKRDHI PFRQCKLTHA 300
LKDSLGGNCN MVLVTNIYGE AAQLEETLSS LRFASRMKLV TTEPAINEKY DAERMVKNLE 360
KELALLKQEL AIHDSLTNRT FVTYDPMDEI QIAEINSQVR RYLEGTLDEI DIISLRQIKE 420
VFNQFRVVLS QQEQEVESTL RRKYTLIDRN DFAAISAIQK AGLVDVDGHL VGEPEGQNFG 480
LGVAPFSTKP GKKAKSKKTF KEPLRPDTPP SKPVAFEEFK NEQGSEINRI FKENKSILNE 540
RRKRASETTQ HINAIKREID VTKEALNFQK SLREKQGKYE NKGLMIIDEE EFLLILKLKD 600
LKKQYRSEYQ DLRDLRAEIQ YCQHLVDQCR HRLLMEFDIW YNESFVIPED MQMALKPGGS 660
IRPGMVPVNR IVSLGEDDQD KFSQLQQRVL PEGPDSISFY NAKVKIEQKH NYLKTMMGLQ 720
QAHRK

Seq ID NO: 309 DNA sequence 

Nucleic Acid Accession #: CAT cluster 



1 11 21 31 41 51 

| | | | | |

TTTTTTTTTT TTTTTTTTAA TGCCTGCTGT CATGCTCTGT CTACCAGGGT GAATTTCCAA 60 

AAATTTCTGC ATAGCAATTT TAGCCAAAAC TATATATGTT CTGGGGAGGA TAGGCATAGG 120 

CACATTGAAG ACCAAAGGAA AGAGTGAAGA AGTGTAGTTG GGTCATTGTG AATGGATGTT 180 

TAGATTGTCA AGAAAAGTGG GCCAGAGGCC CCACCTCACA CTAGGACGGC AATTGCCTCT 240 

CATTAGTATC TCAGGCACCA TGGGTCTTAT TTGGTGTCAT AAGAAACACC CTCAACAAAG 300 

TAATGAACCC TCAGCCTCCA GCTTCTCTTC TTCGGGATTC TTCTTAGGGC CTCCTTTTTC 360 

CTTTTATGTT TCCAGTACCC TGAATTTCTT ATTCCCATCC CCCATTAAAA TCTGCTTCAA 420 

AGAAAAAACA AGAAGGACAC ATTCACTTTA AGATCCAAAT GAATGATAAG AGCTTAAAAC 480 
ATTATACTTA TCAGTATTAT TTGCATTTTT ATAGAAACCA AAACCATATT TCAACAAC 

Seq ID NO: 310 DNA sequence

Nucleic Acid Accession #: NM_018622.2

Coding sequence: 1..1140

1 11 21 31 41 51

| | | | | |

ATGGCGTGGC GAGGCTGGGC GCAGAGAGGC TGGGGCTGCG GCCAGGCGTG GGGTGCGTCG 60 
GTGGGCGGCC GCAGCTGCGA GGAGCTCACT GCGGTCCTAA CCCCGCCGCA GCTCCTCGGA 120 
CGCAGGTTTA ACTTCTTTAT TCAACAAAAA TGCGGATTCA GAAAAGCACC CAGGAAGGTT 180
GAACCTCGAA GATCAGACCC AGGGACAAGT GGTGAAGCAT ACAAGAGAAG TGCTTTGATT 240 
CCTCCTGTGG AAGAAACAGT CTTTTATCCT TCTCCCTATC CTATAAGGAG TCTCATAAAA 300 
CCTTTATTTT TTACTGTTGG GTTTACAGGC TGTGCATTTG GATCAGCTGC TATTTGGCAA 360
TATGAATCAC TGAAATCCAG GGTCCAGAGT TATTTTGATG GTATAAAAGC TGATTGGTTG 420

GATAGCATAA GACCACAAAA AGAAGGAGAC TTCAGAAAGG AGATTAACAA GTGGTGGAAT 480 

AACCTAAGTG ATGGCCAGCG GACTGTGACA GGTATTATAG CTGCAAATGT CCTTGTATTC 540 

TGTTTATGGA GAGTACCTTC TCTGCAGCGG ACAATGATCA GATATTTCAC ATCGAATCCA 600 

GCCTCAAAGG TCCTTTGTTC TCCAATGTTG CTGTCAACAT TCAGTCACTT CTCCTTATTT 660 

CACATGGCAG CAAATATGTA TGTTTTGTGG AGCTTCTCTT CCAGCATAGT GAACATTCTG 720 

GGTCAAGAGC AGTTCATGGC AGTGTACCTA TCTGCAGGTG TTATTTCCAA TTTTGTCAGT 780 

TACCTGGGTA AAGTTGCCAC AGGAAGATAT GGACCATCAC TTGGTGCATC TGGTGCCATC 840

ATGACAGTCC TCGCAGCTGT CTGCACTAAG ATCCCAGAAG GGAGGCTTGC CATTATTTTC 900 

CTTCCGATGT TCACGTTCAC AGCAGGGAAT GCCCTGAAAG CCATTATCGC CATGGATACA 960 

GCAGGAATGA TCCTGGGATG GAAATTTTTT GATCATGCGG CACATCTTGG GGGAGCTCTT 1020 

TTTGGAATAT GGTATGTTAC TTACGGTCAT GAACTGATTT GGAAGAACAG GGAGCCGCTA 1080 
GTGAAAATCT GGCATGAAAT AAGGACTAAT GGCCCCAAAA AAGGAGGTGG CTCTAAGTAA 



Seq ID NO: 311 Protein sequence:
Protein Accession #: NP_061092.2

1 11 21 31 41 51

| | | | | |

MAWRGWAQRG WGCGQAWGAS VGGRSCEELT AVLTPPQLLG RRFNFFIQQK CGFRKAPRKV 60

EPRRSDPGTS GEAYKRSALI PPVEETVFYP SPYPIRSLIK PLFFTVGFTG CAFGSAAIWQ 120

YESLKSRVQS YFDGIKADWL DSIRPQKEGD FRKEINKWWN NLSDGQRTVT GIIAANVLVF 180

CLWRVPSLQR TMIRYFTSNP ASKVLCSPML LSTFSHFSLF HMAANMYVLW SFSSSIVNIL 240

GQEQFMAVYL SAGVISNFVS YLGKVATGRY GPSLGASGAI MTVLAAVCTK IPEGRLAIIF 300

LPMFTFTAGN ALKAIIAMDT AGMILGWKFF DHAAHLGGAL FGIWYVTYGH ELIWKNREPL 360
VKIWHEIRTN GPKKGGGSK

Seq ID NO: 312 DNA sequence 
Nucleic Acid Accession #: NM_000625 
Coding sequence: 195..3656

1 11 21 31 41 51

| | | | | |

CTCTCGGCCA CCTTTGATGA GGGGACTGGG CAGTTCTAGA CAGTCCCGAA GTTCTCAAGG 60 

CACAGGTCTC TTCCTGGTTT GACTGTCCTT ACCCCGGGGA GGCAGTGCAG CCAGCTGCAA 120 

GCCCCACAGT GAAGAACATC TGAGCTCAAA TCCAGATAAG TGACATAAGT GACCTGCTTT 180 

GTAAAGCCAT AGAGATGGCC TGTCCTTGGA AATTTCTGTT CAAGACCAAA TTCCACCAGT 240 

ATGCAATGAA TGGGGAAAAA GGCATCAACA ACAATGTGGA GAAAGCCCCC TGTGCCACCT 300 

CCAGTCCAGT GACACAGGAT GACCTTCAGT ATCACAACCT CAGCAAGCAG CAGAATGAGT 360 

CCCCGCAGCC CCTCGTGGAG ACGGGAAAGA AGTCTCCAGA ATCTCTGGTC AAGCTGGATG 420 

CAACCCCATT GTCCTCCCCA CGGCATGTGA GGATCAAAAA CTGGGGCAGC GGGATGACTT 480 

TCCAAGACAC ACTTCACCAT AAGGCCAAAG GGATTTTAAC TTGCAGGTCC AAATCTTGCC 540 

TGGGGTCCAT TATGACTCCC AAAAGTTTGA CCAGAGGACC CAGGGACAAG CCTACCCCTC 600 

CAGATGAGCT TCTACCTCAA GCTATCGAAT TTGTCAACCA ATATTACGGC TCCCTCAAAG 660 

AGGCAAAAAT AGAGGAACAT CTGGCCAGGG TGGAAGCGGT AACAAAGGAG ATAGAAACAA 720 

CAGTAACCTA CCAACTGACG GGAGATGAGC TCATCTTCGC CACCAAGCAG GCCTGGCGCA 780 

ATGCCCCACG CTGCATTGGG AGGATCCAGT GGTCCAACCT GCAGGTCTTC GATGCCCGCA 840 

GCTGTTCCAC TGCCCGGGAA ATGTTTGAAC ACATCTGCAG ACACGTGCGT TACTCCACCA 900 

ACAATGGCAA CATCAGGTCG GCCATCACCG TGTTCCCCCA GCGGAGTGAT GGCAAGCACG 960 

ACTTCCGGGT GTGGAATGCT CAGCTCATCC GCTATGCTGG CTACCAGATG CCAGATGGCA 1020 

GCATCAGAGG GGACCCTGCC AACGTGGAAT TCACTCAGCT GTGCATCGAC CTGGGCTGGA 1080 

AGCCCAAGTA CGGCCGCTTC GATGTGGTCC CCCTGGTCCT GCAGGCCAAT GGCCGTGACC 1140 

CTGAGCTCTT CGAAATCCCA CCTGACCTTG TGCTTGAGGT GGCCATGGAA CATCCCAAAT 1200 

ACGAGTGGTT TCGGGAACTG GAGCTAAAGT GGTACGCCCT GCCTGCAGTG GCCAACATGC 1260 

TGCTTGAGGT GGGCGGCCTG GAGTTCCCAG GGTGCCCCTT CAATGGCTGG TACATGGGCA 1320 

CAGAGATCGG AGTCCGGGAC TTCTGTGATG TCCAGCGCTA CAACATCCTG GAGGAAGTGG 1380 

GCAGGAGAAT GGGCCTGGAA ACGCACAAGC TGGCCTCGCT CTGGAAAGAC CAGGCTGTCG 1440 

TTGAGATCAA CATTGCTGTG CTCCATAGTT TCCAGAAGCA GAATGTGACC ATCATGGACC 1500 

ACCACTCGGC TGCAGAATCC TTCATGAAGT ACATGCAGAA TGAATACCGG TCCCGTGGGG 1560 

GCTGCCCGGC AGACTGGATT TGGCTGGTCC CTCCCATGTC TGGGAGCATC ACCCCCGTGT 1620 

TTCACCAGGA GATGCTGAAC TACGTCCTGT CCCCTTTCTA CTACTATCAG GTAGAGGCCT 1680 

GGAAAACCCA TGTCTGGCAG GACGAGAAGC GGAGACCCAA GAGAAGAGAG ATTCCATTGA 1740 

AAGTCTTGGT CAAAGCTGTG CTCTTTGCCT GTATGCTGAT GCGCAAGACA ATGGCGTCCC 1800 

GAGTCAGAGT CACCATCCTC TTTGCGACAG AGACAGGAAA ATCAGAGGCG CTGGCCTGGG 1860 

ACCTGGGGGC CTTATTCAGC TGTGCCTTCA ACCCCAAGGT TGTCTGCATG GATAAGTACA 1920

GGCTGAGCTG CCTGGAGGAG GAACGGCTGC TGTTGGTGGT GACCAGTACG TTTGGCAATG 1980 

GAGACTGCCC TGGCAATGGA GAGAAACTGA AGAAftTCGCT CTTCATGCTG AAAGAGCTCA 2040 

ACAACAAATT CAGGTACGCT GTGTTTGGCC TCGGCTCCAG CATGTACCCT CGGTTCTGCG 2100 

CCTTTGCTCA TGACATTGAT CAGAAGCTGT CCCACCTGGG GGCCTCTCAG CTCACCCCGA 2160 

TGGGAGAAGG GGATGAGCTC AGTGGGCAGG AGGACGCCTT CCGCAGCTGG GCCGTGCAAA 2220 

CCTTCAAGGC AGCCTGTGAG ACGTTTGATG TCCGAGGCAA ACAGCACATT CAGATCCCCA 2280 

AGCTCTACAC CTCCAATGTG ACCTGGGACC CGCACCACTA CAGGCTCGTG CAGGACTCAC 2340 

AGCCTTTGGA CCTCAGCAAA GCCCTCAGCA GCATGCATGC CAAGAACGTG TTCACCATGA 2400 

GGCTCAAATC TCGGCAGAAT CTACAAAGTC CGACATCCAG CCGTGCCACC ATCCTGGTGG 2460

AACTCTCCTG TGAGGATGGC CAAGGCCTGA ACTACCTGCC GGGGGAGCAC CTTGGGGTTT 2520

GCCCAGGCAA CCAGCCGGCC CTGGTCCAAG GCATCCTGGA GCGAGTGGTG GATGGCCCCA 2580 

CACCCCACCA GGCAGTGCGC CTGGAGGCCC TGGATGAGAG TGGCAGCTAC TGGGTCAGTG 2640 

ACAAGAGGCT GCCCCCCTGC TCACTCAGCC AGGCCCTCAC CTACTTCCTG GACATCACCA 2700 

CACCCCCAAC CCAGCTGCTG CTCCAAAAGC TGGCCCAGGT GGCCACAGAA GAGCCTGAGA 2760 

GACAGAGGCT GGAGGCCCTG TGCCAGCCCT CAGAGTACAG CAAGTGGAAG TTCACCAACA 2820 

GCCCCACATT CCTGGAGGTG CTAGAGGAGT TCCCGTCCCT GCGGGTGTCT GCTGGCTTCC 2880 

TGCTTTCCCA GCTCCCCATT CTGAAGCCCA GGTTCTACTC CATCAGCTCC CCCCGGGATC 2940 

ACACGCCCAC GGAGATCCAC CTGACTGTGG CCGTGGTCAC CTACCACACC CGAGATGGCC 3000 

AGGGTCCCCT GCACCACGGC GTCTGCAGCA CATGGCTCAA CAGCCTGAAG CCCCAAGACC 3060

CAGTGCCCTG CTTTGTGCGG AATGCCAGCG GCTTCCACCT CCCCGAGGAT CCCTCCCATC 3120 



CTTGCATCCT CATCGGGCCT GGCACAGGCA TGGCGCCCTT CCGCAGTTTC TGGCAGCAAC 3180

GGCTCCATGA CTCCCAGCAC AAGGGAGTGC GGGGAGGCCG CATGACCTTG GTGTTTGGGT 3240 

GCCGCCGCCC AGATGAGGAC CACATCTACC AGGAGGAGAT GCTGGAGATG GCCCAGAAGG 3300

GGGTGCTGCA TGCGGTGCAC ACAGCCTATT CCCGCCTGCC TGGCAAGCCC AAGGTCTATG 3360 

TTCAGGACAT CCTGCGGCAG CAGCTGGCCA GCGAGGTGCT CCGTGTGCTC CACAAGGAGC 3420 

CAGGCCACCT CTATGTTTGC GGGGATGTGC GCATGGCCCG GGACGTGGCC CACACCCTGA 3480

AGCAGCTGGT GGCTGCCAAG CTGAAATTGA ATGAGGAGCA GGTCGAGGAC TATTTCTTTC 3540 

AGCTCAAGAG CCAGAAGCGC TATCACGAAG ATATCTTTGG TGCTGTATTT CCTTACGAGG 3600 

CGAAGAAGGA CAGGGTGGCG GTGCAGCCCA GCAGCCTGGA GATGTCAGCG CTCTGAGGGC 3660 

CTACAGGAGG GGTTAAAGCT GCCGGCACAG AACTTAAGGA TGGAGCCAGC TCTGCATTAT 3720 

CTGAGGTCAC AGGGCCTGGG GAGATGGAGG AAAGTGATAT CCCCCAGCCT CAAGTCTTAT 3780 

TTCCTCAACG TTGCTCCCCA TCAAGCCCTT TACTTGACCT CCTAACAAGT AGCACCCTGG 3840 
ATTGATCGGA GCCTC 

Seq ID NO: 313 Protein sequence:
Protein Accession #: NP_000616

1 11 21 31 41 51

| | | | | |

MACPWKFLFK TKFHQYAMNG EKGINNNVEK APCATSSPVT QDDLQYHNLS KQQNESPQPL 60

VETGKKSPES LVKLDATPLS SPRHVRIKNW GSGMTFQDTL HHKAKGILTC RSKSCLGSIM 120

TPKSLTRGPR DKPTPPDELL PQAIEFVNQY YGSLKEAKIE EHLARVEAVT KEIETTVTYQ 180

LTGDELIFAT KQAWRNAPRC IGRIQWSNLQ VFDARSCSTA REMFEHICRH VRYSTNNGNI 240

RSAITVFPQR SDGKHDFRVW NAQLIRYAGY QMPDGSIRGD PANVEFTQLC IDLGWKPKYG 300

RFDVVPLVLQ ANGRDPELFE IPPDLVLEVA MEHPKYEWFR ELELKWYALP AVANMLLEVG 360

GLEFPGCPFN GWYMGTEIGV RDFCDVQRYN ILEEVGRRMG LETHKLASLW KDQAVVEINI 420

AVLHSFQKQN VTIMDHHSAA ESFMKYMQNE YRSRGGCPAD WIWLVPPMSG SITPVFHQEM 480

LNYVLSPFYY YQVEAWKTHV WQDEKRRPKR REIPLKVLVK AVLFACMLMR KTMASRVRVT 540

ILFATETGKS EALAWDLGAL FSCAFNPKVV CMDKYRLSCL EEERLLLVVT STFGNGDCPG 600

NGEKLKKSLF MLKELNNKFR YAVFGLGSSM YPRFCAFAHD IDQKLSHLGA SQLTPMGEGD 660

ELSGQEDAFR SWAVQTFKAA CETFDVRGKQ HIQIPKLYTS NVTWDPHHYR LVQDSQPLDL 720

SKALSSMHAK NVFTMRLKSR QNLQSPTSSR ATILVELSCE DGQGLNYLPG EHLGVCPGNQ 780

PALVQGILER VVDGPTPHQA VRLEALDESG SYWVSDKRLP PCSLSQALTY FLDITTPPTQ 840

LLLQKLAQVA TEEPERQRLE ALCQPSEYSK WKFTNSPTFL EVLEEFPSLR VSAGFLLSQL 900

PILKPRFYSI SSPRDHTPTE IHLTVAVVTY HTRDGQGPLH HGVCSTWLNS LKPQDPVPCF 960

VRNASGFHLP EDPSHPCILI GPGTGIAPFR SFWQQRLHDS QHKGVRGGRM TLVFGCRRPD 1020

EDHIYQEEML EMAQKGVLHA VHTAYSRLPG KPKVYVQDIL RQQLASEVLR VLHKEPGHLY 1080

VCGDVRMARD VAHTLKQLVA AKLKLNEEQV EDYFFQLKSQ KRYHEDIFGA VFPYEAKKDR 1140
VAVQPSSLEM SAL

Seq ID NO: 314 DNA sequence 
Nucleic Acid Accession #: XM_087254 
Coding sequence: 47..2332

1 11 21 31 41 51

| | | | | |

AGAGTACGTG TTTACAGATA AAACTGGTAC ACTGACAGAA AATGAGATGC AGTTTCGGGA 60 

ATGTTCAATT AATGGCATGA AATACCAAGA AATTAATGGT AGACTTGTAC CCGAAGGACC 120 

AACACCAGAC TCTTCAGAAG GAAACTTATC TTATCTTAGT AGTTTATCCC ATCTTAACAA 180 

CTTATCCCAT CTTACAACCA GTTCCTCTTT CAGAACCAGT CCTGAAAATG AAACTGAACT 240 

AATTAAAGAA CATGATCTCT TCTTTAAAGC AGTCAGTCTC TGTCACACTG TACAGATTAG 300 

CAATGTTCAA ACTGACTGCA CTGGTGATGG TCCCTGGCAA TCCAACCTGG CACCATCGCA 360 

GTTGGAGTAC TATGCATCTT CACCAGATGA AAAGGCTCTA GTAGAAGCTG CTGCAAGGAT 420 

TGGTATTGTG TTTATTGGCA ATTCTGAAGA AACTATGGAG GTTAAAACTC TTGGAAAACT 480 

GGAACGGTAC AAACTGCTTC ATATTCTGGA ATTTGATTCA GATCGTAGGA GAATGAGTGT 540 

AATTGTTCAG GCACCTTCAG GTGAGAAGTT ATTATTTGCT AAAGGAGCTG AGTCATCAAT 600 

TCTCCCTAAA TGTATAGGTG GAGAAATAGA AAAAACCAGA ATTCATGTAG ATGAATTTGC 660 

TTTGAAAGGG CTAAGAACTC TGTGTATAGC ATATAGAAAA TTTACATCAA AAGAGTATGA 720 

GGAAATAGAT AAACGCATAT TTGAAGCCAG GACTGCCTTG CAGCAGCGGG AAGAGAAATT 780 

GGCAGCTGTT TTCCAGTTCA TAGAGAAAGA CCTGATATTA CTTGGAGCCA CAGCAGTAGA 840 

AGACAGACTA CAAGATAAAG TTCGAGAAAC TATTGAAGCA TTGAGAATGG CTGGTATCAA 900 

AGTATGGGTA CTTACTGGGG ATAAACATGA AACAGCTGTT AGTGTGAGTT TATCATGTGG 960 

CCATTTTCAT AGAACCATGA ACATCCTTGA ACTTATAAAC CAGAAATCAG ACAGCGAGTG 1020 

TGCTGAACAA TTGAGGCAGC TTGCCAGAAG AATTACAGAG GATCATGTGA TTCAGCATGG 1080 

GCTGGTAGTG GATGGGACCA GCCTATCTCT TGCACTCAGG GAGCATGAAA AACTATTTAT 1140 

GGAAGTTTGC AGAAATTGTT CAGCTGTATT ATGCTGTCGT ATGGCTCCAC TGCAGAAAGC 1200

AAAAGTAATA AGACTAATAA AAATATCACC TGAGAAACCT ATAACATTGG CTGTTGGTGA 1260 

TGGTGCTAAT GAGGTAAGCA TGATACAAGA AGCCCATGTT GGCATAGGAA TCATGGGTAA 1320 

AGAAGGAAGA CAGGCTGCAA GAAACAGTGA CTATGCAATA GCCAGATTTA AGTTCCTCTC 1380 

CAAATTGCTT TTTGTTCATG GTCATTTTTA TTATATTAGA ATAGCTACCC TTGTACAGTA 1440 

TTTTTTTTAT AAGAATGTGT GCTTTATCAC ACCCCAGTTT TTATATCAGT TCTACTGTTT 1500 

GTTTTCTCAG CAAACATTGT ATGACAGCGT GTACCTGACT TTATACAATA TTTGTTTTAC 1560 

TTCCCTACCT ATTCTGATAT ATAGTCTTTT GGAACAGCAT GTAGACCCTC ATGTGTTACA 1620 

AAATAAGCCC ACCCTTTATC GAGACATTAG TAAAAACCGC CTCTTAAGTA TTAAAACATT 1680 

TCTTTATTGG ACCATCCTGG GCTTCAGTCA TGCCTTTATT TTCTTTTTTG GATCCTATTT 1740 

ACTAATAGGG AAAGATACAT CTCTGCTTGG AAATGGCCAG ATGTTTGGAA ACTGGACATT 1800 

TGGCACTTTG GTCTTCACAG TCATGGTTAT TACAGTCACA GTAAAGATGG CTCTGGAAAC 1860 

TCATTTTTGG ACTTGGATCA ACCATCTCGT TACCTGGGGA TCTATTATAT TTTATTTTGT 1920 

ATTTTCCTTG TTTTATGGAG GGATTCTCTG GCCATTTTTG GGCTCCCAGA ATATGTATTT 1980 

TGTGTTTATT CAGCTCCTGT CAAGTGGTTC TGCTTGGTTT GCCATAATCC TCATGGTTGT 2040 

TACATGTCTA TTTCTTGATA TCATAAAGAA GGTCTTTGAC CGACACCTCC ACCCTACAAG 2100 

TACTGAAAAG GCACAGCTTA CTGAAACAAA TGCAGGTATC AAGTGCTTGG ACTCCATGTG 2160 

CTGTTTCCCG GAAGGAGAAG CAGCGTGTGC ATCTGTTGGA AGAATGCTGG AACGAGTTAT 2220

AGGAAGATGT AGTCCAACCC ACATCAGCAG ATCATGGAGT GCATCGGATC CTTTCTATAC 2280 

CAACGACAGG AGCATCTTGA CTCTCTCCAC AATGGACTCA TCTACTTGTT AAAGGGGCAG 2340

TAGTACTTTG TGGGAGCCAG TTCACCTCCT TTCCTAAAAT TCAGTGTGAT CACCCTGTTA 2400

ATGGCCACAC TAGCTCTGAA ATTAATTTCC AAAATCTTTG TAGTAGTTCA TACCCACTCA 2460 

GAGTTATAAT GGCAAACAAA CAGAAAGCAT TAGTACAAGC CCCTCCCAAC ACCCTTAATT 2520 

TGAATCTGAA CATGTTAAAA TTTGAGAATA AAGAGACATT TTTCATCTCT TTGTCTGGTT 2580 

TGTCCCTTGT GCTTATGGGA CTCCTAATGG CATTTCAGTC TGTTGCTGAG GCCATTATAT 2640

TTTAATATAA ATGTAGAAAA AAGAGAGAAA TCTTAGTAAA GAGTATTTTT TAGTATTAGC 2700

TTGATTATTG ACTCTTCTAT TTAAATCTGC TTCTGTAAAT TATGCTGAAA GTTTGCCTTG 2760 

AGAACTCTAT TTTTTTATTA GAGTTATATT TAAAGCTTTT CATGGGAAAA GTTAATGTGA 2820 

ATACTGAGGA ATTTTGGTCC CTCAGTGACC TGTGTTGTTA ATTCATTAAT GCATTCTGAG 2880 

TTCACAGAGC AAATTAGGAG AATCATTTCC AACCATTATT TACTGCAGTA TGGGGAGTAA 2940 

ATTTATACCA ATTCCTCTAA CTGTACTGTA ACACAGCCTG TAAAGTTAGC CATATAAATG 3000 

CAAGGGTATA TCATATATAC AAATCAGGAA TCAGGTCCGT TCACCGAACT TCAAATTGAT 3060 

GTTTACTAAT ATTTTTGTGA CAGAGTATAA AGACCCTATA GTGGGTAAAT TAGATACTAT 3120 

TAGCATATTA TTAATTTAAT GTCTTTATCA TTGGATCTTT TGCATGCTTT AATCTGGTTA 3180 

ACATATTTAA ATTTGCTTTT TTTCTCTTTA CCTGAAGGCT CTGTGTATAG TATTTCATGA 3240 

CATCGTTGTA CAGTTTAACT ATATCAATAA AAAGTTTGGA CAGTATTTAA ATATTGCAAA 3300 

TATGTTTAAT TATACAAATC AGAATAGTAT GGGTAATTAA ATGAATACAA AAAGAAGAGC 3360 

CTCTTTCTGC AGCCGACTTA GACATGCTCT TCCCTTTCTA TAAGCTAGAT TTTAGAATAA 3420 

AGGGTTTCAG TTAATAATCT TATTTTCAGG TTATGTCATC TAACTTATAG CAAACTACCA 3480 

CAATACAGTG AGTTCTGCCA GTGTCCCAGT ACAAGGCATA TTTCAGGTGT GGCTGTGGAA 3540 

TGTAAAAATG CTCAACTTGT ATCAGGTAAT GTTAGCAATA AATTAAATGC TAAGAATGAT 3600 

TAATCGGGTA CATGTTACTG TAATTAACTC ATTGCACTTC AAAACCTAAC TTCCATCCTG 3660 

AATTTATCAA GTAGTTCAGT ATTGTCATTT GTTTTTGTTT TATTGAAAAG TAATGTTGTC 3720 

TTAAGATTTA GAAGTGATTA TTAGCTTGAG AACTATTACC CAGCTCTAAG CAAATAATGA 3780 

TTGTATACAT ATTAAGATAA TGGTTAAATG CGGTTTTACC AAGTTTTCCC TTGAAAATGT 3840 

AATTCCTTTA TGGAGATTTA TTGTGCAGCC CTAAGCTTCC TTCCCATTTC ATGAATATAA 3900 

GGCTTCTAGA ATTGGACTGG CAGGGGAAAG AATGGTAGAG ACAGAAATTA AGACTTTATC 3960 

CTTGTTTGCT TGTAAACTAT TATTTTCTTG CTAATGTAAC ATTTGTCTGT TCCAGTGATG 4020 

TAAGGATATT AAGTTATTAA GCTAAATATT AATTTTCAAA AATAGTCCTT CTTTAACTTA 4080 

GATATTTCAT AGCTGGATTT AGGAAGATCT GTTATTCTGG AAGTACTAAA AAGAATAATA 4140 

CAACGTACAA TGTCTGCATT CACTAATTCA TGTTCCAGAA GAGGAAATAA TGAAGATATA 4200 

CTCAGTAGAG TACTAGGTGG GAGGATATGG AAATTTGCTC ATAAAATCTC TTATAAAACG 4260 

TGCATATAAC AAAATGACAC CCAGTAGGCC TGCATTACAT TTACATGACC GTGTTTATTT 4320 

GCCATCAAAT AAACTGAGTA CTGACACCAG ACAAAGACTC CAAAGTCATA AAATAGCCTA 4380 

TGACCAACTG CAGCAAGACA GGAGGTCAGC TCGCCTATAA TGGTGCTTAA AGTGTGATTG 4440 

ATGTAATTTT CTGTACTCAC CATTTGAAGT TAGTTAAGGA GAACTTTATT TTTTTAAAAA 4500 

AAGTAAATGG CAACCACTAG TGTGCTCATC CTGAACTGTT ACTCCAAATC CACTCCGTTT 4560 

TTAAAGCAAA ATTATCTTGT GATTTTAAGA AAAGAGTTTT CTATTTATTT AAGAAAGTAA 4620 

CAATGCAGTC TGCAAGCTTT CAGTAGTTTT CTAGTGCTAT ATTCATCCTG TAAAACTCTT 4680 

ACTACGTAAC CAGTAATCAC AAGGAAAGTG TCCCCTTTGC ATATTTCTTT AAAATTCTTT 4740 

CTTTGGAAAG TATGATGTTG ATAATTAACT TACCCTTATC TGCCAAAACC AGAGCAAAAT 4800 

GCTAAATACG TTATTGCTAA TCAGTGGTCT CAAATCGATT TGCCTCCCTT TGCCTCGTCT 4860 

GAGGGCTGTA AGCCTGAAGA TAGTGGCAAG CACCAAGTCA GTTTCCAAAA TTGCCCCTCA 4920 

GCTGCTTTAA GTGACTCAGC ACCCTGCCTC AGCTTCAGCA GGCGTAGGCT CACCCTGGGC 4980 

GGAGCAAAGT ATGGGCCAGG GAGAACTACA GCTACGAAGA CCTGCTGTCG AGTTGAGAAA 5040 

AGGGGAGAAT TTATGGTCTG AATTTTCTAA CTGTCCTCTT TCTTGGGTCT AAAGCTCATA 5100 

ATACACAAAG GCTTCCAGAC CTGAGCCACA CCCAGGCCCT ATCCTGAACA GGAGACTAAA 5160 

CAGAGGCAAA TCAACCCTAG GAAATACTTG CATTCTGCCC TACGGTTAGT ACCAGGACTG 5220 

AGGTCATTTC TACTGGAAAA GATTGTGAGA TTGAACTTAT CTGATCGCTT GAGACTCCTA 5280 

ATAGGCAGGA GTCAAGGCCA CTAGAAAATT GACAGTTAAG AGCCAAAAGT TTTTAAAATA 5340 

TGCTACTCTG AAAAATCTCG TGAAGGCTGT AGGAAAAGGG AGAATCTTCC ATGTTGGTGT 5400 

TTTTCCTGTA AAGATCAGTT TGGGGTATGA TATAAGCAGG TATTAATAAA AATAACACAC 5460 

CAAAGAGTTA CGTAAAACAT GTTTTATTAA TTTTGGTCCC CACGTACAGA CATTTTATTT 5520 

CTATTTTGAA ATGAGTTATC TATTTTCATA AAAGTAAAAC ACTATTAAAG TGCTGTTTTA 5580 

TGTGAAATAA CTTGAATGTT GTTCCTATAA AAAATAGATC ATAACTCATG ATATGTTTGT 5640 

AATCATGGTA ATTTAGATTT TTATGAGGAA TGAGTATCTG GAAATATTGT AGCAATACTT 5700 

GGTTTAAAAT TTTGGACCTG AGACACTGTG GCTGTCTAAT GTAATCCTTT AAAAATTCTC 5760 

TGCATTGTCA GTAAATGTAG TATATTATTG TACAGCTACT CATAATTTTT TAAAGTTTAT 5820 
GAAGTTATAT TTATCAAATA AAAACTTTCC TATAT

MQFRECSING MKYQEINGRL VPEGPTPDSS EGNLSYLSSL SHLNNLSHLT TSSSFRTSPE 60

NETELIKEHD LFFKAVSLCH TVQISNVQTD CTGDGPWQSN LAPSQLEYYA SSPDEKALVE 120

AAARIGIVFI GNSEETMEVK TLGKLERYKL LHILEFDSDR RRMSVIVQAP SGEKLLFAKG 180

AESSILPKCI GGEIEKTRIH VDEFALKGLR TLCIAYRKFT SKEYEEIDKR IFEARTALQQ 240

REEKLAAVFQ FIEKDLILLG ATAVEDRLQD KVRETIEALR MAGIKVWVLT GDKHETAVSV 300

SLSCGHFHRT MNILELINQK SDSECAEQLR QLARRITEDH VIQHGLVVDG TSLSLALREH 360

EKLFMEVCRN CSAVLCCRMA PLQKAKVIRL IKISPEKPIT LAVGDGANDV SMIQEAHVGI 420

GIMGKEGRQA ARNSDYAIAR FKFLSKLLFV HGHFYYIRIA TLVQYFFYKN VCFITPQFLY 480

QFYCLFSQQT LYDSVYLTLY NICFTSLPIL IYSLLEQHVD PHVLQNKPTL YRDISKNRLL 540

SIKTFLYWTI LGFSHAFIFF FGSYLLIGKD TSLLGNGQMF GNWTFGTLVF TVMVITVTVK 600

MALETHFWTW INHLVTWGSI IFYFVFSLFY GGILWPFLGS QNMYFVFIQL LSSGSAWFAI 660

ILMVVTCLFL DIIKKVFDRH LHPTSTEKAQ LTETNAGIKC LDSMCCFPEG EAACASVGRM 720
LERVIGRCSP THISRSWSAS DPFYTNDRSI LTLSTMDSST C

CTCGCCAGCG GTCCGCGGGG CTGGAGACCC ACGCCGTGGA GAGGACCAGC CTCAGGTCGC 60

CCCGCCTGGG CCCGCGCCCC GACCTOGCTG CCCCCGCCTC GCCTCTCTGC CCGTGGCGCT 120

TACCGCCACC TTGGCCTCGG GGGCAGGGCA TGGGCGGCCC CCGCCAGATC GCCCAGGGCC 180

AGTACTAACT GCCCTCGCTC TGGOCTTCGA GCCCGAAGCC TCTTCTGCGC GCACAACCTA 240 

GGCAGTAATC CTAAACTAGC GGGCACCACA GACCAGCTGC AGCCACCCCA ACCCAGGGAT 300 

CACTTCCGGA CCCCTCGACC GCCCGGCACC AGCGCGCAAG GGACOCTTCA GCCGGAGACC 360 

AGAGTCCAGT CCOGGTOGOG AGGCCACCGC CGCTGCCCGC CTCGAGAAGC ACAACGCGGG 420 

CTGAGCCGTC GGCTAGCGGG TCACTCCCGA GCCTCTGTCT GCACOGCGCC AGCCCCAGAC 480 

CACGGACGCT GAGCCTCCAG CGCGCGCCAG CCTGGGCCGC TGGGCTCTCC GGGCCAGCCC 540 

GCGACGATCC CCTGAGCTCT CCGCAGAAGG GCCGAGCGTC CGTTCCGGGG ACGCCAGGCC 600 

CGCCCCCGCC CCCCGACAGC CGCGGGGATC CAGAGCCCGG GGGTGCGGGA CGCCCGCGCC 660 

ATGACTGCCG AGAGCGGGCC GCCGCCGCCG CAGCCGGAGG TGCTGGCTAC CGTGAAGGAA 720 

GAGCGCGGCG AGACGGCAGC AGGGGCCGGG GTCCCAGGGG AGGCCACGGG CCGCGGGGCG 780 

GGCGGGCGGC GCCGCAAGCG CCCCCTGCAG CGCGGGAAGC CGCCCTACAG CTACATCGCG 840 

CTCATCGCCA TGGCCATCGC GCACGCGCCC GAGCGCCGCC TCACGCTGGG CGGCATCTAC 900 

AAGTTCATCA CCGAGCGCTT CCCCTTCTAC CGCGACAACC CCAAAAAGTG GCAGAACAGC 960 

ATCCGCCACA ACCTCACACT CAACGACTGC TTCCTCAAGA TCCCGCGOGA GGCCGGCCGC 1020 

CCGGGTAAGG GCAACTACTG GGCGCTCGAC CCCAACGCGG AGGACATGTT CGAGAGCGGC 1080 

AGCTTCCTGC GCCGCCGCAA GCGCTTCAAG CGCTCGGACC TCTCCACCTA CCCGGCTTAC 1140 

ATGCACGACG CGGCGGCTGC CGCAGCCGCC GCTGCCGCAG COGCCGCCGC CGCCGCCGCC 1200 

GCCGCCATCT TCCCAGGCGC GGTGCCCGCC GCGCGCCCCC CCTACCCGGG CGCCGTCTAT 1260 

GCAGGCTAOG CGCCGCCGTC GCTGGCOGCG CCGCCTCCAG TCTACTACCC CGCGGCGTCG 1320 

CCCGGCCCTT GCCGCGTCTT CGGCCTGGTT CCTGAGCGGC CGCTCAGCCC AGAGCTGGGG 1380 

CCOGCACCGT CGGGGCCQGG CGGCTCTTGC GCCTTTGCCT CCGCCGGCGC CCCCGCTACC 1440 

ACCACCGGCT ACCAGCCCGC AGGCTGCACC GGGGCCCGGC CGGCCAACCC CTCTGCCTAT 1500 

GCGGCTGCCT ACGCGGGCCC CGACGGCGCG TACCOGCAGG GCGCCGGCAG TGCGATCTTT 1560 

GCCGCTGCTG GCCGCCTGGC GGGACCCGCT TCGCCCCCAG CGGGCGGCAG CAGTGGCGGC 1620 

GTGGAGACCA CGGTGGACTT CTACGGGCGC ACGTCGCCCG GCCAGTTCGG AGCGCTGGGA 1680 

GCCTGCTACA ACCCTGGCGG GCAGCTCGGA GGGGCCAGTG CAGGCGCCTA CCATGCTCGC 1740 

CATGCTGCCG CTTATCCCGG TGGGATAGAT CGGTTCGTGT CCGCCATGTG AGCCAGCGTA 1800 

GGGACGAAAA CTCATAGACA CATCGGCTGT TCACACGTTC CCCGCAACCT GAGAACGAAC 1860 

AGGAATGGAG AGAGGACTCA ACTGGGACCC ACGTGGAAAA GACCGAGCAG GCCACAGAGG 1920 

CTCGGTCTCC CCGCGCACAG CGTAGGCACC CTGTGTACTC TGTAAACGGG AGGAGGTGGG 1980 

GCGAGGCAGC CAGAGCCCTT GGACTGGCAC AGGGACCCTC GATGGAGCGA AGCCCTCAAA 2040 

CGGGATGCTT TCTGGCATTC TATCGGGGAG GGTCCTTGGC GGTAACCAGA GGGCAGCGTA 2100 

GTGTCAACAC CAGAGACCAG GATCCAAATT GTGGGGAATC AGTTTCAGCC TTCCATGTGC 2160 

TGCCGGAACT CGGGCCTTTT TACGCGGTTC GTCCTCTAGT GCCTTTAACT GCGTTACTAC 2220 

AATAAAAGGC TGCGGCAGCG CCTTTCTTCT TAAAGTGAGG AGGACAAATT TGCAAAAGAA 2280 

ATAGGCTTTT CTTCTTTTTT AAATTGGAGA AATCTCTGCT CTGGTTGACC TGGGCTGGTT 2340 

TTCCCTGTCT CTGAGAACTT GAGACCTAGC TCCGAGTTGA ACTGTGCGTC AGCACTCCAG 2400 

TCCCATCACC TGAACCTTCA GTCTCCCCCA TCTGTTACAC TAGAGGGCTG CAGGACTCTA 2460 

TCCACCGCCC CCGGGTTATC ATTCAGGGCC CCATCATCTT GGATGCTGCC CTGCGTATTT 2520 

GGCAGCAATG GTGGGCCACC CAGGGCCTCT GAGTAGCCAC CCAAAGCCTA GCCGCTGTTC 2580 

TAGGGAACGG AAAAGAGTTC ATGGCCAAGC GTCTAACCTA AAGTCCCAGG ATTGGCTCCA 2640 

GGCAGCAATT ATATCATAAC TTATTGAACT TTTGAGCAGG ACGTGCTGGT AATTTCATGG 2700 

CTGTTACTGC CCAGTCATAA ATCTGCTTTT CCATTATAAG GCAGAGAGAA GTACATTCGT 2760 

TCATTTGTCC ACTGTTTCTT GTCATCACGC AGCCCTGGAC CCAAAGGGTG AACTAAAGTT 2820 

TAAGGAGATG AGAGGATTCA AGGAGCCCGT TGGTGACGCC TTTCAGTAGC TGGGGAGGGC 2880 

TCTTCCATCC CCAGCACCCC CTGCTACACC TCAGCAGCCT CCCCCATGCA AAAAGGAAAG 2940 

AGAAAAATTA AGTTAGGGCA GTCAGTAAAG TGAGCTTTAG AAAGAAACTG GAATTTTAAC 3000 

TTCATTTTGT ATCTTGCTTA AGTAGCAGGC TCACTAAAAT TAGAGAAAGT CCAATAACTC 3060 

TCCCCCTTTC CCTTGAGAAA TCTTTAAGTT TCGATTCTGG AGCAAAAACT TTCAGCATTA 3120 

AATATTTCAG AGGCTCCATT CACAGCTTTC AGATAAACTG GAGTGTTCAG ATGGACTGTT 3180 

TTAATAAAAA TCTTTGAGCA AGTGAGTTAT GGCAAGAGAA ACTCAGCCTC TTTCTGTATA 3240 

AACTTAACAG GGAAGGGCTG GGGTGTGAAA AAGAAGATTG TATGAAAACC ATTGGTAATT 3300 

TTTATTTTTT ATTTTTGGGA CTGCACTATC CTGTTCACGA AGACATGTGA ACTTGGTTCA 3360 

GTCCAAATGG GGATTTGTAT AAACCAGTGC TCTCCATTAG AAATATGGTG CAAGCCACAT 3420 

ATGTAATTTT AAATATTCTA GTAGCCACAT TAATAAAGTN AAAAGAAACA AAAAAAAAAA 3480 
AA 

FKHLTHYROI DTRANSCRIP TIONFACTOR TTFMTAESGP PPPQPEVLAT VKEERGETAA 60 

GAGVPGEATG RGAGGRRRKR PLQRGKPPYS YIALIAMAIA HAPERRLTLG GIYKFITERF 120 

PFYRDNPKKW QNSIRHNLTL NDCFLKIPRE AGRPGKGNYW ALDPNAEDMF ESGSFLRRRK 180 

RFKRSDLSTY PAYMHDAAAA AAAAAAAAAA AAAAAIFPGA VPAARPPYPG AVYAGYAPPS 240 

LAAPPPVYYP AASPGPCRVF GLVPERPLSP ELGPAPSGPG GSCAFASAGA PATTTGYQPA 300 

GCTGARPANP SAYAAAYAGP DGAYPQGAGS AIFAAAGRLA GPASPPAGGS SGGVETTVDF 360 
YGRTSPGQFG ALGACYNPGG QLGGASAGAY HARHAAAYPG GIDRFVSAM 

CCGGGCAGGT GGCTCATGCT CGGGAGCGTG GTTGAGOGGC TGGCGCGGTT GTCCTGGAGC 60 

AGGGGCGCAG GAATTCTGAT GTGAAACTAA CAGTCTGTGA GCCCTGGAAC CTCCGCTCAG 120 

AGAAGATGAA GGATATCGAC ATAGGAAAAG AGTATATCAT CCCCAGTCCT GGGTATAGAA 180 

GTGTGAGGGA GAGAACCAGC ACTTCTGGGA CGCACAGAGA CCGTGAAGAT TCCAAGTTCA 240 

GGAGAACTCG ACCGTTGGAA TGCCAAGATG CCTTGGAAAC AGCAGCCCGA GCCGAGGGCC 300 

TCTCTCTTGA TGCCTCCATG CATTCTCAGC TCAGAATCCT GGATGAGGAG CATCCCAAGG 360 

GAAAGTACCA TCATGGCTTG AGTGCTCTGA AGCCCATCCG GACTACTTCC AAACACCAGC 420 

ACCCAGTGGA CAATGCTGGG CTTTTTTCCT GTATGACTTT TTCGTGGCTT TCTTCTCTGG 480 

CCCGTGTGGC CCACAAGAAG GGGGAGCTCT CAATGGAAGA CGTGTGGTCT CTGTCCAAGC 540 

ACGAGTCTTC TGACGTGAAC TGCAGAAGAC TAGAGAGACT GTGGCAAGAA GAGCTGAATG 600 

AAGTTGGGCC AGACGCTGCT TCCCTGCGAA GGGTTGTGTG GATCTTCTGC CGCACCAGGC 660 

TCATCCTGTC CATCGTGTGC CTGATGATCA CGCAGCTGGC TGGCTTCAGT GGACCAGCCT 720 

TCATGGTGAA ACACCTCTTG GAGTATACCC AGGCAACAGA GTCTAACCTG CAGTACAGCT 780 

TGTTGTTAGT GCTGGGCCTC CTCCTGACGG AAATCGTGCG GTCTTGGTCG CTTGCACTGA 840 

CTTGGGCATT GAATTACCGA ACCGGTGTCC GCTTGCGGGG GGCCATCCTA ACCATGGCAT 900 

TTAAGAAGAT CCTTAAGTTA AAGAACATTA AAGAGAAATC CCTGGGTGAG CTCATCAACA 960 

TTTGCTCCAA CGATGGGCAG AGAATGTTTG AGGCAGCAGC OGTTGGCAGC CTGCTGGCTG 1020 

GAGGACCCGT TGTTGCCATC TTAGGCATGA TTTATAATGT AATTATTCTG GGACCAACAG 1080 

GCTTCCTGGG ATCAGCTGTT TTTATCCTCT TTTACCCAGC AATGATGTTT GCATCACGGC 1140 

TCACAGCATA TTTCAGGAGA AAATGCGTGG CCGCCACGGA TGAACGTGTC CAGAAGATGA 1200 

ATGAAGTTCT TACTTACATT AAATTTATCA AAATGTATGC CTGGGTCAAA GCATTTTCTC 1260 

AGAGTGTTCA AAAAATCCGC GAGGAGGAGC GTCGGATATT GGAAAAAGCC GGGTACTTCC 1320 

AGGGTATCAC TGTGGGTGTG GCTCCCATTG TGGTGGTGAT TGCCAGCGTG GTGACCTTCT 1380 

CTGTTCATAT GACCCTGGGC TTCGATCTGA CAGCAGCACA GGCTTTCACA GTGGTGACAG 1440 

TCTTCAATTC CATGACTTTT GCTTTGAAAG TAACACCGTT TTCAGTAAAG TCCCTCTCAG 1500 

AAGCCTCAGT GGCTGTTGAC AGATTTAAGA GTTTGTTTCT AATGGAAGAG GTTCACATGA 1560 

TAAAGAACAA ACCAGCCAGT CCTCACATCA AGATAGAGAT GAAAAATGCC ACCTTGGCAT 1620 

GGGACTCCTC CCACTCCAGT ATCCAGAACT CGCCCAAGCT GACCCCCAAA ATGAAAAAAG 1680 

ACAAGAGGGC TTCCAGGGGC AAGAAAGAGA AGGTGAGGCA GCTGCAGCGC ACTGAGCATC 1740 

AGGCGGTGCT GGCAGAGCAG AAAGGCCACC TCCTCCTGGA CAGTGACGAG CGGCCCAGTC 1800 

CCGAAGAGGA AGAAGGCAAG CACATCCACC TGGGCCACCT GCGCTTACAG AGGACACTGC 1860 

ACAGCATCGA TCTGGAGATC CAAGAGGGTA AACTGGTTGG AATCTGCGGC AGTGTGGGAA 1920 

GTGGAAAAAC CTCTCTCATT TCAGCCATTT TAGGCCAGAT GACGCTTCTA GAGGGCAGCA 1980 

TTGCAATCAG TGGAACCTTC GCTTATGTGG CCCAGCAGGC CTGGATCCTC AATGCTACTC 2040 

TGAGAGACAA CATCCTGTTT GGGAAGGAAT ATGATGAAGA AAGATACAAC TCTGTGCTGA 2100 

ACAGCTGCTG CCTGAGGCCT GACCTGGCCA TTCTTCCCAG CAGCGACCTG ACGGAGATTG 2160 

GAGAGCGAGG AGCCAACCTG AGCGGTGGGC AGCGCCAGAG GATCAGCCTT GCCCGGGCCT 2220 

TGTATAGTGA CAGGAGCATC TACATCCTGG ACGACCCCCT CAGTGCCTTA GATGCCCATG 2280 

TGGGCAACCA CATCTTCAAT AGTGCTATCC GGAAACATCT CAAGTCCAAG ACAGTTCTGT 2340 

TTGTTACCCA CCAGTTACAG TACCTGGTTG ACTGTGATGA AGTGATCTTC ATGAAAGAGG 2400 

GCTGTATTAC GGAAAGAGGC ACCCATGAGG AACTGATGAA TTTAAATGGT GACTATGCTA 2460 

CCATTTTTAA TAACCTGTTG CTGGGAGAGA CACCGCCAGT TGAGATCAAT TCAAAAAAGG 2520 

AAACCAGTGG TTCACAGAAG AAGTCACAAG ACAAGGGTCC TAAAACAGGA TCAGTAAAGA 2580 

AGGAAAAAGC AGTAAAGCCA GAGGAAGGGC AGCTTGTGCA GCTGGAAGAG AAAGGGCAGG 2640 

GTTCAGTGCC CTGGTCAGTA TATGGTGTCT ACATCCAGGC TGCTGGGGGC CCCTTGGCAT 2700 

TCCTGGTTAT TATGGCCCTT TTCATGCTGA ATGTAGGCAG CACCGCCTTC AGCACCTGGT 2760 

GGTTGAGTTA CTGGATCAAG CAAGGAAGCG GGAACACCAC TGTGACTCGA GGGAACGAGA 2820 

CCTGGGTGAG TGACAGCATG AAGGACAATC CTCATATGCA GTACTATGCC AGCATCTACG 2880 

CCCTCTCCAT GGCAGTCATG CTGATCCTGA AAGCCATTCG AGGAGTTGTC TTTGTCAAGG 2940 

GCACGCTGCG AGCTTCCTCC CGGCTGCATG ACGAGCTTTT CCGAAGGATC CTTCGAAGCC 3000 

CTATGAAGTT TTTTGACACG ACCCCCACAG GGAGGATTCT CAACAGGTTT TCCAAAGACA 3060 

TGGATGAAGT TGACGTGCGG CTGCCGTTCC AGGCCGAGAT GTTCATCCAG AACGTTATCC 3120 

TGGTGTTCTT CTGTGTGGGA ATGATCGCAG GAGTCTTCCC GTGGTTCCTT GTGGCAGTGG 3180 

GGCCCCTTGT CATCCTCTTT TCAGTCCTGC ACATTGTCTC CAGGGTCCTG ATTCGGGAGC 3240 

TGAAGCGTCT GGACAATATC ACGCAGTCAC CTTTCCTCTC CCACATCACG TCCAGCATAC 3300 

AGGGCCTTGC CACCATCCAC GCCTACAATA AAGGGCAGGA GTTTCTGCAC AGATACCAGG 3360 

AGCTGCTGGA TGACAACCAA GCTCCTTTTT TTTTGTTTAC GTGTGCGATG CGGTGGCTGG 3420 

CTGTGCGGCT GGACCTCATC AGCATCGCCC TCATCACCAC CACGGGGCTG ATGATCGTTC 3480 

TTATGCACGG GCAGATTCCC CCAGCCTATG CGGGTCTCGC CATCTCTTAT GCTGTCCAGT 3540 

TAACGGGGCT GTTCCAGTTT ACGGTCAGAC TGGCATCTGA GACAGAAGCT CGATTCACCT 3600 

CGGTGGAGAG GATCAATCAC TACATTAAGA CTCTGTCCTT GGAAGCACCT GCCAGAATTA 3660 

AGAACAAGGC TCCCTCCCCT GACTGGCCCC AGGAGGGAGA GGTGACCTTT GAGAACGCAG 3720 

AGATGAGGTA CCGAGAAAAC CTCCCTCTTG TCCTAAAGAA AGTATCCTTC ACGATCAAAC 3780 

CTAAAGAGAA GATTGGCATT GTGGGGCGGA CAGGATCAGG GAAGTCCTCG CTGGGGATGG 3840 

CCCTCTTCCG TCTGGTGGAG TTATCTGGAG GCTGCATCAA GATTGATGGA GTGAGAATGA 3900 

GTGATATTGG CCTTGCCGAC CTCCGAAGCA AACTCTCTAT CATTCCTCAA GAGCCGGTGC 3960 

TGTTCAGTGG CACTGTCAGA TCAAATTTGG ACCCCTTCAA CCAGTACACT GAAGACCAGA 4020 

TTTGGGATGC CCTGGAGAGG ACACACATGA AAGAATGTAT TGCTCAGCTA CCTCTGAAAC 4080 

TTGAATCTGA AGTGATGGAG AATGGGGATA ACTTCTCAGT GGGGGAACGG CAGCTCTTGT 4140 

GCATAGCTAG AGCCCTGCTC CGCCACTGTA AGATTCTGAT TTTAGATGAA GCCACAGCTG 4200 

CCATGGACAC AGAGACAGAC TTATTGATTC AAGAGACCAT CCGAGAAGCA TTTGCAGACT 4260 

GTACCATGCT GACCATTGCC CATCGCCTGC ACACGGTTCT AGGCTCCGAT AGGATTATGG 4320 

TGCTGGCCCA GGGACAGGTG GTGGAGTTTG ACACCCCATC GGTCCTTCTG TCCAACGACA 4380 

GTTCCCGATT CTATGCCATG TTTGCTGCTG CAGAGAACAA GGTCGCTGTC AAGGGCTGAC 4440 

TCCTCCCTGT TGACGAAGTC TCTTTTCTTT AGAGCATTGC CATTCCCTGC CTGGGGCGGG 4500 

CCCCTCATCG CGTCCTCCTA CCGAAACCTT GCCTTTCTCG ATTTTATCTT TCGCACAGCA 4560 

GTTCCGGATT GGCTTGTGTG TTTCACTTTT AGGGAGAGTC ATATTTTGAT TATTGTATTT 4620 

ATTCCATATT CATGTAAACA AAATTTAGTT TTTGTTCTTA ATTGCACTCT AAAAGGTTCA 4680 

GGGAACCGTT ATTATAATTG TATCAGAGGC CTATAATGAA GCTTTATACG TGTAGCTATA 4740 

TCTATATATA ATTCTGTACA TAGCCTATAT TTACAGTGAA AATGTAAGCT GTTTATTTTA 4800 

TATTAAAATA AGCACTGTGC TAATAACAGT GCATATTCCT TTCTATCATT TTTGTACAGT 4860 

TTGCTGTACT AGAGATCTGG TTTTGCTATT AGACTGTAGG AAGAGTAGCA TTTCATTCTT 4920 

CTCTAGCTGG TGGTTTCAOG GTGCCAGGTT TTCTGGGTGT CCAAAGGAAG ACGTGTGGCA 4980 

ATAGTGGGCC CTCCGACAGC CCCCTCTGCC GCCTCCCCAC AGCCGCTCCA GGGGTGGCTG 5040 

GAGACGGGTG GGCGGCTGGA GACCATGCAG AGCGCCGTGA GTTCTCAGGG CTCCTGCCTT 5100 

CTGTCCTGGT GTCACTTACT GTTTCTGTCA GGAGAGCAGC GGGGCGAAGC CCAGGCCCCT 5160 

TTTCACTCCC TCCATCAAGA ATGGGGATCA CAGAGACATT CCTCCGAGCC GGGGAGTTTC 5220 

TTTCCTGCCT TCTTCTTTTT GCTGTTGTTT CTAAACAAGA ATCAGTCTAT CCACAGAGAG 5280 

TCCCACTGCC TCAGGTTCCT ATGGCTGGCC ACTGCACAGA GCTCTCCAGC TCCAAGACCT 5340 

GTTGGTTCCA AGCCCTGGAG CCAACTGCTG CTTTTTGAGG TGGCACTTTT TCATTTGCCT 5400 

ATTCCCACAC CTCCACAGTT CAGTGGCAGG GCTCAGGATT TCGTGGGTCT GTTTTCCTTT 5460 

CTCACCGCAG TCGTCGCACA GTCTCTCTCT CTCTCTCCCC TCAAAGTCTG CAACTTTAAG 5520 

CAGCTCTTGC TAATCAGTGT CTCACACTGG CGTAGAAGTT TTTGTACTGT AAAGAGACCT 5580 

ACCTCAGGTT GCTGGTTGCT GTGTGGTTTG GTGTGTTCCC GCAAACCCCC TTTGTGCTGT 5640 

GGGGCTGGTA GCTCAGGTGG GCGTGGTCAC TGCTGTCATC AGTTGAATGG TCAGCGTTGC 5700 

ATGTCGTGAC CAACTAGACA TTCTGTCGCC TTAGCATGTT TGCTGAACAC CTTGTGGAAG 5760 

CAAAAATCTG AAAATGTGAA TAAAATTATT TTGGATTTTG TAAAAAAAAA AAAAAAAAAA 5820 
AAAAAAAAAA AAAAAAAA 

MKDIDIGKEY IIPSPGYRSV RERTSTSGTH RDREDSKFRR TRPLECQDAL ETAARAEGLS 60 

LDASMHSQLR ILDEEHPKGK YHHGLSALKP IRTTSKHQHP VDNAGLFSCM TFSWLSSLAR 120 

VAHKKGELSM EDVWSLSKHE SSDVNCRRLE RLWQEELNEV GPDAASLRRV VWIFCRTRLI 180 

LSIVCLMITQ LAGFSGPAFM VKHLLEYTQA TESNLQYSLL LVLGLLLTEI VRSWSLALTW 240 

ALNYRTGVRL RGAILTMAFK KILKLKNIKE KSLGELINIC SNDGQRMFEA AAVGSLLAGG 300 

PVVAILGMIY NVIILGPTGF LGSAVFILFY PAMMFASRLT AYFRRKCVAA TDERVQKMNE 360 

VLTYIKFIKM YAWVKAFSQS VQKIREEERR ILEKAGYFQG ITVGVAPIVV VIASVVTFSV 420 

HMTLGFDLTA AQAFTVVTVF NSMTFALKVT PFSVKSLSEA SVAVDRFKSL FLMEEVHMIK 480 

NKPASPHIKI EMKNATLAWD SSHSSIQNSP KLTPKMKKDK RASRGKKEKV RQLQRTEHQA 540 

VLAEQKGHLL LDSDERPSPE EEEGKHIHLG HLRLQRTLHS IDLEIQEGKL VGICGSVGSG 600 

KTSLISAILG QMTLLEGSIA ISGTFAYVAQ QAWILNATLR DNILFGKEYD EERYNSVLNS 660 

CCLRPDLAIL PSSDLTEIGE RGANLSGGQR QRISLARALY SDRSIYILDD PLSALDAHVG 720 

NHIFNSAIRK HLKSKTVLFV THQLQYLVDC DEVIFMKEGC ITERGTHEEL MNLNGDYATI 780 

FNNLLLGETP PVEINSKKET SGSQKKSQDK GPKTGSVKKE KAVKPEEGQL VQLEEKGQGS 840 

VPWSVYGVYI QAAGGPLAFL VIMALFMLNV GSTAFSTWWL SYWIKQGSGN TTVTRGNETS 900 

VSDSMKDNPH MQYYASIYAL SMAVMLILKA IRGVVFVKGT LRASSRLHDE LFRRILRSPM 960 

KFFDTTPTGR ILNRFSKDMD EVDVRLPFQA EMFIQNVILV FFCVGMIAGV FPWFLVAVGP 1020 

LVILFSVLHI VSRVLIRELK RLDNITQSPF LSHITSSIQG LATIHAYNKG QEFLHRYQEL 1080 

LDDNQAPFFL FTCAMRWLAV RLDLISIALI TTTGLMIVLM HGQIPPAYAG LAISYAVQLT 1140 

GLFQFTVRLA SETEARFTSV ERINHYIKTL SLEAPARIKN KAPSPDWPQE GEVTFENAEM 1200 

RYRENLPLVL KKVSFTIKPK EKIGIVGRTG SGKSSLGMAL FRLVELSGGC IKIDGVRISD 1260 

IGLADLRSKL SIIPQEPVLF SGTVRSNLDP FNQYTEDQIW DALERTHMKE CIAQLPLKLE 1320 

SEVMENGDNF SVGERQLLCI ARALLRHCKI LILDEATAAM DTETDLLIQE TIREAFADCT 1380 
MLTIAHRLHT VLGSDRIMVL AQGQVVEFDT PSVLLSNDSS RFYAMFAAAE NKVAVKG 

Seq ID NO: 320 DNA sequence 

Nucleic Acid Accession #: AK022089.1 

Coding sequence: 181-1488 

1 11 21 31 41 51 

I I I I I I 

AGCAGTTGCA CAACTTCCAG CAACTTTCTC AGCCGGCTAC TAATGAGCTG AAAGCCAGGA 60 

ACATCCGAGG AGAAGAGAAA GCTTCCAGCC CTCCTCCCTT CACCCTGGAA ATCCAGACAC 120 

CCCCACCCCC ACCCTCAGAT CACTTTAAGA TAATTTCTTT ATTCGTTTGC CCGACAGACC 180 

ATGGCTCCCT TTGGAAGAAA CTTGCTAAAG ACTCGGCATA AAAACAGATC TCCAACTAAA 240 

GACATGGATT CAGAAGAGAA GGAAATTGTG GTTTGGGTTT GCCAAGAAGA GAAGCTTGTC 300 

TGTGGGCTGA CTAAACGCAC CACCTCTGCT GATGTCATCC AGGCTTTGCT TGAGGAACAT 360 

GAGGCTACGT TTGGAGAGAA ACGATTTCTT CTGGGGAAGC CCAGTGATTA CTGCATCATA 420 

GAGAAGTGGA GAGGCTCCGA AAGGGTTCTT CCTCCACTAA CTAGAATCCT GAAGCTTTGG 480 

AAAGCGTGGG GAGATGAGCA GCCCAATATG CAATTTGTTT TGGTTAAAGC AGATGCTTTT 540 

CTTCCAGTTC CTTTGTGGCG GACAGCTGAA GCCAAATTAG TGCAAAACAC AGAAAAATTG 600 

TGGGAGCTCA GCCCAGCAAA CTACATGAAG ACTTTACCAC CAGATAAACA AAAAAGAATA 660 

GTCAGGAAAA CTTTCCGGAA ACTGGCTAAA ATTAAGCAGG ACACAGTTTC TCATGATCGA 720 

GATAATATGG AGACATTAGT TCATCTGATC ATTTCCCAGG ACCATACTAT TCATCAGCAA 780 

GTCAAGAGAA TGAAAGAGCT GGATCTGGAA ATTGAAAAGT GTGAAGCTAA GTTCCATCTT 840 

GATCGAGTAG AAAATGATGG AGAAAACTAT GTTCAGGATG CATATTTAAT GCCCAGTTTC 900 

AGTGAAGTTG AGCAAAATCT AGACTTGCAG TATGAGGAAA ACCAGACTCT GGAGGACCTG 960 

AGCGAAAGTG ATGGAATTGA ACAGCTGGAA GAACGACTGA AATATTACCG AATACTCATT 1020 

GATAAGCTCT CTGCTGAAAT AGAAAAAGAG GTAAAAAGTG TTTGCATTGA TATAAATGAA 1080 

GATGCGGAAG GGGAAGCTGC AAGTGAACTG GAAAGCTCTA ATTTAGAGAG TGTTAAGTGT 1140 

GATTTGGAGA AAAGCATGAA AGCTGGTTTG AAAATTCACT CTCATTTGAG TGGCATCCAG 1200 

AAAGAGATTA AATACAGTGA CTCATTGCTT CAGATGAAAG CAAAAGAATA TGAACTCCTG 1260 

GCCAAGGAAT TCAATTCACT TCACATTAGC AACAAAGATG GGTGCCAGTT AAAGGAAAAC 1320 

AGAGCGAAGG AATCTGAGGT TCCCAGTAGC AATGGGGAGA TTCCTCCCTT TACTCAAAGA 1380 

GTATTTAGCA ATTACACAAA TGACACAGAC TCGGACACTG GTATCAGTTC TAACCACAGT 1440 

CAGGACTCCG AAACAACAGT AGGAGATGTG GTGCTGTTGT CAACATAGTT CCAATGGCTC 1500 

CTTTCTGACC TGCTTTCATG TTTTAATGTT TGTTTAATTT AATAGGAAAC CTCATTTTAA 1560 

ATATAACACT CAAAAAAATG TAAATCATAT TGTAGTATTC AATAGTTAAT AAAAACTCGA 1620 
GAAATGTGTT GTTTCTG 

Seq ID NO: 321 Protein sequence: 
Protein Accession #: NP_005438.1 

1 11 21 31 41 51 

I I I I I I 

MAPFGRNLLK TRHKNRSPTK DMDSEEKEIV VWVCQEEKLV CGLTKRTTSA DVIQALLEEH 60 

EATFGEKRFL LGKPSDYCII EKWRGSERVL PPLTRILKLW KAWGDEQPNM QFVLVKADAF 120 

LPVPLWRTAE AKLVQNTEKL WELSPANYMK TLPPDKQKRI VRKTFRKLAK IKQDTVSHDR 180 

DNMETLVHLI ISQDHTIHQQ VKRMKELDLE IEKCEAKFHL DRVENDGENY VQDAYLMPSF 240 

SEVEQNLDLQ YEENQTLEDL SESDGIEQLE ERLKYYRILI DKLSAEIEKE VKSVCIDINE 300 

DAEGEAASEL ESSNLESVKC DLEKSMKAGL KIHSHLSGIQ KEIKYSDSLL QMKAKEYELL 360 

AKEFNSLHIS NKDGCQLKEN RAKESEVPSS NGEIPPFTQR VFSNYTNDTD SDTGISSNHS 420 
QDSETTVGDV VLLST 

Seq ID NO: 322 DNA sequence 

Nucleic Acid Accession #: NM_030920.1 
Coding sequence: 317-1123 

1 11 21 31 41 51 

I I I I I I 

AGCATTGAAG GGGAAGGAAC TGCGGGTGTG GTGTGTGTAT GTGTGTGTGT ATGTGTGTGC 60 

GGCGCGTGCG TGCGTGTGTG TGCGCGOGCT AGTGTGTGGA CAAGGAGGTG GGGGCAGCTG 120 

AGTTAGAGTC CCAACTCTTG GACTCCATTT GCTATTCTCT TCTTTCTCCC CCACACCTAT 180 

CTGGTGGTGG TAGTGGGCGT TTATATTTGC GTTCCTTTTC ATTCATTTCT AAATCTCTTA 240 

AAAATTTTGG GTTGGGGGTA TTGGGGAAGG CAGGAAAGGG AAAAGGAGAG TAGTAGCTGA 300 

AGAGCAAGAG GAGGACATGG AGATGAAGAA GAAGATTAAC CTGGAGTTAA GGAACAGATC 360 

CCCGGAGGAG GTGACAGAGT TAGTCCTTGA TAATTGCCTG TGTGTCAATG GGGAAATTGA 420 

AGGCCTGAAT GATACTTTCA AAGAACTAGA ATTTCTGAGT ATGGCTAATG TGGAACTAAG 480 

TTCGCTGGCC CGGCTTCCCA GCTTAAATAA ACTTCGAAAA TTGGAGCTTA GTGATAATAT 540 

AATTTCTGGA GGCTTGGAAG TCCTGGCAGA GAAATGTCCA AATCTTACCT ACCTCAATCT 600 

GAGTGGAAAC AAAATAAAAG ATCTCAGTAC AGTAGAAGCT CTGCAAAATC TTAAAAATTT 660 

GAAAAGTCTT GACCTGTTTA ACTGTGAGAT CACAAACCTG GAAGATTATA GAGAAAGTAT 720 

TTTTGAACTA CTGCAGCAAA TCACATACTT AGATGGATTT GATCAGGAGG ATAATGAAGC 780 

GCCGGACTCT GAAGAGGAGG ATGATGAGGA TGGAGATGAA GATGATGAAG AGGAAGAGGA 840 

AAATGAAGCT GGTCCACCGG AAGGATATGA GGAAGAGGAG GAGGAAGAGG AAGAGGAGGA 900 

TGAGGATGAG GATGAAGATG AAGATGAAGC AGGTTCAGAG TTGGGAGAGG GAGAAGAGGA 960 

AGTGGGCCTC TCATACTTAA TGAAAGAAGA AATTCAGGAT GAAGAAGATG ATGATGACTA 1020 

TGTTGAAGAA GGGGAAGAAG AGGAAGAAGA GGAAGAAGGA GGTCTTCGAG GGGAGAAGAG 1080 

GAAACGAGAT GCTGAAGACG ATGGAGAGGA AGAAGATGAC TAGATCATTC TAAGACCAGA 1140 

TTCTCTAATG TTTCTGGGTG TGCAATAGAG TGATCACATC TTTGTTTCTT CATGTACGAT 1200 

AGCTATCCCT ACAGAAGATA ATGTGTAACT TTTTATAGGA AAAGTGTGGT TTTACTATTT 1260 

TTGCCTTATC ATTCCAAATA AGAACTAGTC TGTTAATGAT CATATTGTAT GTAGAGAAAA 1320 

ATTTTCATTG ACTCCCATTG TGGAATTCCC TAGCAATTTA TTTAGACTTA ATTTTTTAAA 1380 

TTCAAGCTTA CTGTATTAGT CATTTTTAGC CCATAATTAA AACATGATCA CTTTTAAACA 1440 

GGTGTAGTAT GGTGCATTTC ATTCCTTATT TATAGATTAA CTGAAATTAC AGTTTGCTAT 1500 

AATATAAAAT GACAATAGTC TCTTGAGTGG TAAGTTGGTT ATTTTTTTAG AGGTGATCCA 1560 

GGAATCTTTA GTTTGAAGGC AGTTACCTTT TTTTTTTTTT TTTTTTTTTG ACTAAGAGTG 1620 

XTTGGTTGCT TTTTTGTCAC AAGTAACTTG GAAAATAGAA GCAGAATAGT AAAGGTTCTA 1680 

TTCAGCAACA TAGTTCATGG ATTTTGTGGA GGTTCTATTC AGTAATATGG TTCATGGATT 1740 

TAGTGGTGAC TGATAAGATT TTATTTTTGA AGGAAAAATT GCTTATACTA AGTCCAGAGA 1800 

CATGCAGGTG AGCCCTTTTG TCAGGCTGCA AATCATGACA TGCCGATGGT TGTTTATTTT 1860 

GTTTTTAGGT GTGCATTCTT TTTCTTCTTA GCAATTCCTT TATGATCACC TTCCCTTCTT 1920 

GTTTCACTCC CTCCCGCTCT CTCAAAAGGA ACTTGGGAAA CTTGTGAAAC CCAGGAAAAC 1980 

CTTTAGTCTT ATACCTCAAC TACGTTTCAG TCCTGTCTGG GTTTTAAATA AGTGAAGTAG 2040 

AAGAAATTGA GTATTTTCTG ACATAAGAAT ATATTATCAA TACAGTTTTA TGCAGTAAGC 2100 

TCTCCTTACC ATAAATGTTT CTTGGTTGAC AACATCTAAG ACAATATTAG TGGGATGAAG 2160 

AAAGAAAAGC AGGGGTGCTT TTGGAAGCAG TGTTAGTGTT CCTCAAAAGT CGGAACAATT 2220 

GCCTGTTGAT ATATTAATAA GACATTAAAG TCAAATTTTA ATGTTGGCCT CTCAAATGAT 2280 

TTGGATACCA CTCTGCAAAG TATTTCTAAC CTTTAATTCC CAGTTTTAAA ACAGATATAA 2340 

TAATAGCATT TAATTGGAAT ATACTAGGCA GCTGGAAAAG TATTTGAAAC TAAATTGACA 2400 

TTAAAATTAA GATTTGTTTT CAAGTGGATG TCCATTAAAA GTAGAAAAAT ATTTGGGATA 2460 

AGTGAGTGTG TGTTTCCTTA CATGGCTACT AAATAAAATA TAATGAGTAT ACAAGTATAT 2520 

CTCCTCTTTT GCTATGGAGG CTCCATGTTC AAGGCAATGG CTTTTTAAAT CTTGGCTATC 2580 

TAAAATTTTT TCCCTTTGTT TTGAATATTT GTAAGTTTTT AAGAAGTTAG TGTCAGCAAA 2640 

TTAATTGAAG TTATGCTTCT ATACTGGGAC ATATTTAAAT ACTGAGTATA GTACTGCTGC 2700 

TACTGCTTCT ACAATGTAAA ATGTATGACT TGGTGTTTTA AAGTAAAAAT TATGATGTTA 2760 

CTTGTGGAGA AACTAAAAAT GTTGTACAAC TGACCGAAAG AAAACCCTTG GGGATAAGTT 2820 

TAGTGAGGGG ATTGGAATCC CCAAAAAGAT AACATTTTTC TTCTGCTTTT AAAAACTGAA 2880 

ATTCCCTGTT CTAGTTCCTA ACAATTCTCA TTACATACTA TGCCAGATTA CAAAATACTT 2940 

ATTTTTAAAA TGAAATCTAT ATATTGACTT TCTTATCAAT CATCTTACTG TGCAATCAAA 3000 

ATTAGAGTAC TTTGGTTTGA AAACAACACT TAGAGCCTCC AGATAACTTT TAAGACTTAT 3060 

TTAGCTTTGT GGGTGGTATT TTCATGCAAA TAAGTAAGGG TGGGTTTTAT ATTTTGTAGA 3120 

AGTTTTCGGT CCTATTTTAA TGCTCTTTGT ATGGCAGTAT GTATATATTG TGTTAAGTTC 3180 

CTCAAGAATC TCCTTAAAAA CTTTGAAGTT AATACTTTTG TGCAACTGTG TTTTGAATAA 3240 
AGCCATGACA GTGTTAAAAA CAAAC 

Seq ID NO: 323 Protein sequence: 
Protein Accession #: NP_112182.1 

1 11 21 31 41 51 

I I I I I I 

MEMKKKINLE LRNRSPEEVT ELVLDNCLCV NGEIEGLNDT FKELEFLSMA NVELSSLARL 60 

PSLNKLRKLE LSDNIISGGL EVLAEKCPNL TYLNLSGNKI KDLSTVEALQ NLKNLKSLDL 120 

FNCEITNLED YRESIFELLQ QITYLDGFDQ EDNEAPDSEE EDDEDGDEDD EEEEENEAGP 180 

PEGYEEEEEE EEEEDEDEDE DEDEAGSELG EGEEEVGLSY LMKEEIQDEE DDDDYVEEGE 240 
EEEEEEEGGL RGEKRKRDAE DDGEEEDD 

Seq ID NO: 324 DNA sequence 
Nucleic Acid Accession #: NM_003812 
Coding sequence: 224-2722 

1 11 21 31 41 51 

I I I I I I 

TCCTCTGCGT CCCGCCCCGG GAGTGGCTGC GAGGCTAGGC GAGCCGGGAA AGGGGGCGCC 60 

GCCCAGCCCC GAGCCCCGCG CCCCGTGCCC CGAGCCCGGA GCCCCCTGCC CGCGGCGGCA 120 

CCATGCGCGC CGAGCCGGCG TGACCGGCTC CGCCCGCGGC CGCCCCGCAG CTAGCCCGGC 180 

GCTCTCGCCG GCCACACGGA GCGGCGCCCG GGAGCTATGA GCCATGAAGC CGCCCGGCAG 240 

CAGCTCGCGG CAGCCGCCCC TGGCGGGCTG CAGCCTTGCC GGCGCTTCCT GCGGCCCCCA 300 

ACGCGGCCCC GCCGGCTCGG TGCCTGCCAG CGCCCCGGCC CGCACGCCGC CCTGCCGCCT 360 

GCTTCTCGTC CTTCTCCTGC TGCCTCCGCT CGCCGCCTCG TCCCGGCCCC GCGCCTGGGG 420 

GGCTGCTGCG CCCAGCGCTC CGCATTGGAA TGAAACTGCA GAAAAAAATT TGGGAGTCCT 480 

GGCAGATGAA GACAATACAT TGCAACAGAA TAGCAGCAGT AATATCAGTT ACAGCAATGC 540 

AATGCAGAAA GAAATCACAC TGCCTTCAAG ACTCATATAT TACATCAACC AAGACTCGGA 600 

AAGCCCTTAT CACGTTCTTG ACACAAAGGC AAGACACCAG CAAAAACATA ATAAGGCTGT 660 

CCATCTGGCC CAGGCAAGCT TCCAGATTGA AGCCTTCGGC TCCAAATTCA TTCTTGACCT 720 

CATACTGAAC AATGGTTTGT TGTCTTCTGA TTATGTGGAG ATTCACTACG AAAATGGGAA 780 

ACCACAGTAC TCTAAGGGTG GAGAGCACTG TTACTACCAT GGAAGCATCA GAGGCGTCAA 840 

AGACTCCAAG GTGGCTCTGT CAACCTGCAA TGGACTTCAT GGCATGTTTG AAGATGATAC 900 

CTTCGTGTAT ATGATAGAGC CACTAGAGCT GGTTCATGAT GAGAAAAGCA CAGGTCGACC 960 

ACATATAATC CAGAAAACCT TGGCAGGACA GTATTCTAAG CAAATGAAGA ATCTCACTAT 1020 

GGAAAGAGGT GACCAGTGGC CCTTTCTCTC TGAATTACAG TGGTTGAAAA GAAGGAAGAG 1080 

AGCAGTGAAT CCATCACGTG GTATATTTGA AGAAATGAAA TATTTGGAAC TTATGATTGT 1140 

TAATGATCAC AAAACGTATA AGAAGCATCG CTCTTCTCAT GCACATACCA ACAACTTTGC 1200 

AAAGTCCGTG GTCAACCTTG TGGATTCTAT TTACAAGGAG CAGCTCAACA CCAGGGTTGT 1260 

CCTGGTGGCT GTAGAGACCT GGACTGAGAA GGATCAGATT GACATCACCA CCAACCCTGT 1320 

GCAGATGCTC CATGAGTTCT CAAAATACCG GCAGCGCATT AAGCAGCATG CTGATGCTGT 1380 

GCACCTCATC TCGCGGGTGA CATTTCACTA TAAGAGAAGC AGTCTGAGTT ACTTTGGAGG 1440 

TGTCTGTTCT CGCACAAGAG GAGTTGGTGT GAATGAGTAT GGTCTTCCAA TGGCAGTGGC 1500 

ACAAGTATTA TCGCAGAGCC TGGCTCAAAA CCTTGGAATC CAATGGGAAC CTTCTAGCAG 1560 

AAAGCCAAAA TGTGACTGCA CAGAATCCTG GGGTGGCTGC ATCATGGAGG AAACAGGGGT 1620 

GTCCCATTCT CGAAAATTTT CAAAGTGCAG CATTTTGGAG TATAGAGACT TTTTACAGAG 1680 

AGGAGGTGGA GCCTGCCTTT TCAACAGGCC AACAAAGCTA TTTGAGCCCA CGGAATGTGG 1740 

AAATGGATAC GTGGAAGCTG GGGAGGAGTG TGATTGTGGT TTTCATGTGG AATGCTATGG 1800 

ATTATGCTGT AAGAAATGTT CCCTCTCCAA CGGGGCTCAC TGCAGCGACG GGCCCTGCTG 1860 

TAACAATACC TCATGTCTTT TTCAGCCACG AGGGTATGAA TGCCGGGATG CTGTGAACGA 1920 

GTGTGATATT ACTGAATATT GTACTGGAGA CTCTGGTCAG TGCCCACCAA ATCTTCATAA 1980 

GCAAGACGGA TATGCATGCA ATCAAAATCA GGGCCGCTGC TACAATGGCG AGTGCAAGAC 2040 

CAGAGACAAC CAGTGTCAGT ACATCTGGGG AACAAAGGCT GCAGGGTCTG ACAAGTTCTG 2100 

CTATGAAAAG CTGAATACAG AAGGCACTGA GAAGGGAAAC TGCGGGAAGG ATGGAGACCG 2160 

GTGGATTCAG TGCAGCAAAC ATGATGTGTT CTGTGGATTC TTACTCTGTA CCAATCTTAC 2220 

TCGAGCTCCA CGTATTGGTC AACTTCAGGG TGAGATCATT CCAACTTCCT TCTACCATCA 2280 

AGGCCGGGTG ATTGACTGCA GTGGTGCCCA TGTAGTTTTA GATGATGATA CGGATGTGGG 2340 

CTATGTAGAA GATGGAACGC CATGTGGCCC GTCTATGATG TGTTTAGATC GGAAGTGCCT 2400 

ACAAATTCAA GCCCTAAATA TGAGCAGCTG TCCACTCGAT TCCAAGGGTA AAGTCTGTTC 2460 

GGGCCATGGG GTGTGTAGTA ATGAAGCCAC CTGCATTTGT GATTTCACCT GGGCAGGGAC 2520 

AGATTGCAGT ATCCGGGATC CAGTTAGGAA CCTTCACCCC CCCAAGGATG AAGGACCCAA 2580 

GGGTCCTAGT GCCACCAATC TCATAATAGG CTCCATCGCT GGTGCCATCC TGGTAGCAGC 2640 

TATTGTCCTT GGGGGCACAG GCTGGGGATT TAAAAATGTC AAGAAGAGAA GGTTCGATCC 2700 

TACTCAGCAA GGCCCCATCT GAATCAGCTG CGCTGGATGG ACACCGCCTT GCACTGTTGG 2760 

ATTCTGGGTA TGACATACTC GCAGCAGTGT TACTGGAACT ATTAAGTTTG TAAACAAAAC 2820 

CTTTGGGTGG TAATGACTAC GGAGCTAAAG TTGGGGTGAC AAGGATGGGG TAAAAGAAAA 2880 

CTGTCTCTTT TGGAAATAAT GTCAAAGAAC ACCTTTCACC ACCTGTCAGT AAACGGGGGA 2940 

GGGGGCAAAA GACCATGCTA TAAAAAGAAC TGTTCCAGAA TCTTTTTTTT TCCCTAATGG 3000 
ACGAAGGAAC AACACACACA CAAAAATTAA ATGCAATAAA GGAATCATTA AAAA 

Seq ID NO: 325 Protein sequence: 
Protein Accession #: NP_003803 

1 11 21 31 41 51 

I I I I I I 

MKPPGSSSRQ PPLAGCSLAG ASCGPQRGPA GSVPASAPAR TPPCRLLLVL LLLPPLAASS 60 

RPRAWGAAAP SAPHWNETAE KNLGVLADED NTLQQNSSSN ISYSNAMQKE ITLPSRLIYY 120 

INQDSESPYH VLDTKARHQQ KHNKAVHLAQ ASFQIEAFGS KFILDLILNN GLLSSDYVEI 180 

HYENGKPQYS KGGEHCYYHG SIRGVKDSKV ALSTCNGLHG MFEDDTFVYM IEPLELVHDE 240 

KSTGRPHIIQ KTLAGQYSKQ MKNLTMERGD QWPFLSELQW LKRRKRAVNP SRGIFEEMKY 300 

LELMIVNDHK TYKKHRSSHA HTNNFAKSVV NLVDSIYKEQ LNTRVVLVAV ETWTEKDQID 360 

ITTNPVQMLH EFSKYRQRIK QHADAVHLIS RVTFHYKRSS LSYFGGVCSR TRGVGVNEYG 420 

LPMAVAQVLS QSLAQNLGIQ WEPSSRKPKC DCTESWGGCI MEETGVSHSR KFSKCSILEY 480 

RDFLQRGGGA CLFNRPTKLF EPTECGNGYV EAGEECDCGF HVECYGLCCK KCSLSNGAHC 540 

SDGPCCNNTS CLFQPRGYEC RDAVNECDIT EYCTGDSGQC PPNLHKQDGY ACNQNQGRCY 600 

NGECKTRDNQ CQYIWGTKAA GSDKFCYEKL NTEGTEKGNC GKDGDRWIQC SKHDVFCGFL 660 

LCTNLTRAPR IGQLQGEIIP TSFYHQGRVI DCSGAHVVLD DDTDVGYVED GTPCGPSMMC 720 

LDRKCLQIQA LNMSSCPLDS KGKVCSGHGV CSNEATCICD FTWAGTDCSI RDPVRNLHPP 780 
KDEGPKGPSA TNLIIGSIAG AILVAAIVLG GTGWGFKNVK KRRFDPTQQG PI 

Seq ID NO: 326 DNA sequence 

Nucleic Acid Accession #: AK074418.1 

Coding sequence: 244-1515 

1 11 21 31 41 51 

I I I I I I 

CTTTCTCCAA GAGGGCCGGC CATGCTCTCC TCCTCTGCCA GTCTCCTCCA CCACTCTCTA 60 

ACCTGAGAGC CTGTGGAACC TGCCCGTCTC CCCTCCTCCA TCAGACACAC CTGCCTAGGA 120 

AACAGATGGA AAAAGTGAGG GACCGGTGAG TGACTTGCTG CTAAAGTTTA TACCAGATGC 180 

AAATGACAGA GCTGGAGTTC TGCTGTGCCT GGAAAGGACC TCGGAAGTCT TCTAAGGAGA 240 

GTCATGGCGT ATTACCAGGA GCCTTCAGTG GAGACCTCCA TCATCAAGTT CAAAGACCAG 300 

GACTTTACCA CCTTGCGGGA TCACTGCCTG AGCATGGGCC GGACGTTTAA GGATGAGACA 360 

TTCCCCGCAG CAGATTCTTC CATAGGCCAG AAGCTGCTCC AGGAAAAACG CCTCTCCAAT 420 

GTGATATGGA AGCGGCCACA GGATCTACCA GGGGGTCCTC CTCACTTCAT CCTGGATGAT 480 

ATAAGCAGAT TTGACATCCA ACAAGGAGGC GCAGCTGACT GCTGGTTCCT GGCAGCACTG 540 

GGATCCTTGA CTCAGAACCC ACAGTACAGG CAGAAGATCC TGATGGTCCA AAGCTTTTCA 600 

CACCAGTATG CTGGCATTTT CCGTTTCCGG TTCTGGCAAT GTGGCCAGTG GGTGGAAGTG 660 

GTGATTGATG ACCGCCTACC TGTCCAGGGA GATAAATGCC TCTTTGTGCG TCCTCGCCAC 720 

CAAAACCAAG AGTTCTGGCC CTGCCTGCTG GAGAAGGCCT ATGCCAAGCT GCTCGGATCC 780 

TATTCCGATC TGCACTATGG CTTCCTCGAG GATGCCCTGG TGGACCTCAC AGGAGGCGTG 840 

ATCACCAACA TCCATCTGCA CTCTTCCCCT GTGGACCTGG TGAAGGCAGT GAAGACAGCG 900 

ACCAAGGCAG GCTCCCTGAT AACCTGTGCC ACTCCAAGTG GGCCAACAGA TACAGCACAG 960 

GCGATGGAGA ATGGGCTGGT GAGTCTCCAT GCCTACACTG TGACTGGGGC TGAGCAGATT 1020 

CAATACCGAA GGGGCTGGGA AGAAATTATC TCCCTGTGGA ACCCCTGGGG CTGGGGCGAG 1080 

ACCGAATGGA GAGGGCGCTG GAGTGATGGG TCTCAGGAGT GGGAGGAAAC CTGTGATCCG 1140 

CGGAAAAGCC AGCTACATAA GAAACGGGAA GATGGCGAGT TTTGGATGTC GTGTCAAGAT 1200 

TTCCAACAGA AATTCATCGC CATGTTTATA TGTAGCGAAA TTCCAATTAC CCTGGACCAT 1260 

GGAAACACAC TCCACGAAGG ATGGTCCCAA ATAATGTTTA GGAAGCAAGT GATTCTAGGA 1320 

AACACTGCAG GAGGACCTCG GAATGATGCT CAATTCAACT TCTCTGTGCA AGAGCCAATG 1380 

GAAGGCACCA ATGTTGTCGT GTGCGTCACA GTTGCTGTCA CACCATCAAA TTTGAAAGCA 1440 

GAAGATGCAA AATTTCCACT CGATTTCCAA GTGATTCTGG CTGGCTCACA GAAACACTGT 1500 

CCAAAGCTCA AATAATAAAT TCCGCCGCAA CTTCACCATG ACTTACCATC TGAGCCCTGG 1560 

GAACTATGTT GTGGTTGCAC AGACACGGAG AAAATCAGCG GAGTTCTTGC TCOGAATCTT 1620 

CCTGAAAATG CCAGACAGTG ACAGGCACCT GAGCAGCCAT TTCAACCTCA GAATGAAGGG 1680 

AAGCCCTTCA GAACATGGCT CCCAACAAAG CATTTTCAAC AGATATGCTC AGCAGGTATG 1740 

GTACCTAGCA CCCAGGGGCC TTACGTGGGA TTGGAGAAAG GGGACCTGAG GGAGGGACAG 1800 

CCCTCACAGG CCCTTACTGG GATGCAGAGA GGAGAAGTGA CTTGATGGAC TATTTTACCT 1860 

GCCTCTCTTC CTGGATCGTC TCCAGAACTG CTGTGGCTGC CAAGCTCGGT AGAGACGTGG 1920 

CGCCCCACCC AGTCTCATCC GGGGGACTTC AAGCTGGAAT GCAGAGCTTA GAAAGGGAGG 1980 

GGATAATTAT GGGGTGTGAG GTGCATTGCC CTCTAAATCT TTAAACAAGC AATTGGCAGT 2040 

ACCCCGTGAA ACCTTTCCTT CTCCTACTCG GCCACCTCCC ACCAACCTGG CATCGTTCCT 2100 

CCCGGGAGCT AGCCAGCTTC AGAAAGCACA TACAGCATCC TTGCTGCCAA ACCACCTATG 2160 

TGCACACAGG ATTTCCTTAA TGGCTTAATA AACTGTTATA AAGAACTCCT TGACTTGTCA 2220 

GAATAAAATA GCTGCCAGGG GCTCTGCACA ATGAGCCTCT TACCGTTAAA AAAAAAAAAA 2280 
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 

Seq ID NO: 327 Protein sequence: 
Protein Accession #: BAB85075.1 

1 11 21 31 41 51 

I I I I I I 

MAYYQEPSVE TSIIKFKDQD FTTLRDHCLS MGRTFKDETF PAADSSIGQK LLQEKRLSNV 60 

IWKRPQDLPG GPPHFILDDI SRFDIQQGGA ADCWFLAALG SLTQNPQYRQ KILMVQSFSH 120 

QYAGIFRFRF WQCGQWVEVV IDDRLPVQGD KCLFVRPRHQ NQEFWPCLLE KAYAKLLGSY 180 

SDLHYGFLED ALVDLTGGVI TNIHLHSSPV DLVKAVKTAT KAGSLITCAT PSGPTDTAQA 240 

MENGLVSLHA YTVTGAEQIQ YRRGWEEIIS LWNPWGWGET EWRGRWSDGS QEWEETCDPR 300 

KSQLHKKRED GEFWMSCQDF QQKFIAMFIC SEIPITLDHG NTLHEGWSQI MFRKQVILGN 360 

TAGGPRNDAQ FNFSVQEPME GTNVVVCVTV AVTPSNLKAE DAKFPLDFQV ILAGSQKHCP 420 
KLK 



Seq ID NO: 328 DNA sequence 

Nucleic Acid Accession #: BC017490.1 

Coding sequence: 74-2788 

1 11 21 31 41 51 

I I I I I I 

GTGGGTCACG TGAACCACTT TTCGCGCGAA ACCTGGTTGT TGCTGTAGTG GCGGAGAGGA 60 

TCGTGGTACT GCTATGGCGG AATCATCGGA ATCCTTCACC ATGGCATCCA GCCCGGCCCA 120 

GCGTCGGCGA GGCAATGATC CTCTCACCTC CAGCCCTGGC CGAAGCTCCC GGCGTACTGA 180 

TGCCCTCACC TCCAGCCCTG GCCGTGACCT TCCACCATTT GAGGATGAGT CCGAGGGGCT 240 

CCTAGGCACA GAGGGGCCCC TGGAGGAAGA AGAGGATGGA GAGGAGCTCA TTGGAGATGG 300 

CATGGAAAGG GACTACCGCG CCATCCCAGA GCTGGACGCC TATGAGGCCG AGGGACTGGC 360 

TCTGGATGAT GAGGACGTAG AGGAGCTGAC GGCCAGTCAG AGGGAGGCAG CAGAGCGGGC 420 

CATGCGGCAG CGTGACCGGG AGGCTGGCCG GGGCCTGGGC CGCATGCGCC GTGGGCTCCT 480 

GTATGACAGC GATGAGGAGG ACGAGGAGCG CCCTGCCCGC AAGCGCCGCC AGGTGGAGCG 540 

GGCCACGGAG GACGGCGAGG AGGACGAGGA GATGATCGAG AGCATCGAGA ACCTGGAGGA 600 

TCTCAAAGGC CACTCTGTGC GCGAGTGGGT GAGCATGGCG GGCCCCCGGC TGGAGATCCA 660 

CCACCGCTTC AAGAACTTCC TGCGCACTCA CGTCGACAGC CACGGCCACA ACGTCTTCAA 720 

GGAGCGCATC AGCGACATGT GCAAAGAGAA CCGTGAGAGC CTGGTGGTGA ACTATGAGGA 780 

CTTGGCAGCC AGGGAGCACG TGCTGGCCTA CTTCCTGCCT GAGGCACCGG CGGAGCTGCT 840 

GCAGATCTTT GATGAGGCTG CCCTGGAGGT GGTACTGGCC ATGTACCCCA AGTACGACCG 900 

CATCACCAAC CACATCCATG TCCGCATCTC CCACCTGCCT CTGGTGGAGG AGCTGCGCTC 960 

GCTGAGGCAG CTGCATCTGA ACCAGCTGAT CCGCACCAGT GGGGTGGTGA CCAGCTGCAC 1020 

TGGCGTCCTG CCCCAGCTCA GCATGGTCAA GTACAACTGC AACAAGTGCA ATTTCGTCCT 1080 

GGGTCCTTTC TGCCAGTCCC AGAACCAGGA GGTGAAACCA GGCTCCTGTC CTGAGTGCCA 1140 

GTCGGCCGGC CCCTTTGAGG TCAACATGGA GGAGACCATC TATCAGAACT ACCAGCGTAT 1200 

CCGAATCCAG GAGAGTCCAG GCAAAGTGGC GGCTGGCCGG CTGCCCCGCT CCAAGGACGC 1260 

CATTCTCCTC GCAGATCTGG TGGACAGCTG CAAGCCAGGA GACGAGATAG AGCTGACTGG 1320 

CATCTATCAC AACAACTATG ATGGCTCCCT CAACACTGCC AATGGCTTCC CTGTCTTTGC 1380 

CACTGTCATC CTAGCCAACC ACGTGGCCAA GAAGGACAAC AAGGTTGCTG TAGGGGAACT 1440 

GACCGATGAA GATGTGAAGA TGATCACTAG CCTCTCCAAG GATCAGCAGA TCGGAGAGAA 1500 

GATCTTTGCC AGCATTGCTC CTTCCATCTA TGGTCATGAA GACATCAAGA GAGGCCTGGC 1560 

TCTGGCCCTG TTCGGAGGGG AGCCCAAAAA CCCAGGTGGC AAGCACAAGG TACGTGGTGA 1620 

TATCAACGTG CTCTTGTGCG GAGACCCTGG CACAGCGAAG TCGCAGTTTC TCAAGTATAT 1680 

TGAGAAAGTG TCCAGCCGAG CCATCTTCAC CACTGGCCAG GGGGCGTCGG CTGTGGGCCT 1740 

CACGGCGTAT GTCCAGCGGC ACCCTGTCAG CAGGGAGTGG ACCTTGGAGG CTGGGGCCCT 1800 

GGTTCTGGCT GACCGAGGAG TGTGTCTCAT TGATGAATTT GACAAGATGA ATGACCAGGA 1860 

CAGAACCAGC ATCCATGAGG CCATGGAGCA ACAGAGCATC TCCATCTCGA AGGCTGGCAT 1920 

CGTCACCTCC CTGCAGGCTC GCTGCACGGT CATTGCTGCC GCCAACCCCA TAGGAGGGCG 1980 

CTACGACCCC TCGCTGACTT TCTCTGAGAA CGTGGACCTC ACAGAGCCCA TCATCTCACG 2040 

CTTTGACATC CTGTGTGTGG TGAGGGACAC CGTGGACCCA GTCCAGGACG AGATGCTGGC 2100 

CGGCTTCGTG GTGGGCAGCC ACGTCAGACA CCACCCCAGC AACAAGGAGG AGGAGGGGCT 2160 

GGCCAATGGC AGCGCTGCTG AGCCCGCCAT GCCCAACACG TATGGCGTGG AGCCCCTGCC 2220 

CCAGGAGGTC CTGAAGAAGT ACATCATCTA CGCCAAGGAG AGGGTCCACC CGAAGCTCAA 2280 

CCAGATGGAC CAGGACAAGG TGGCCAAGAT GTACAGTGAC CTGAGGAAAG AATCTATGGC 2340 

GACAGGCAGC ATCCCCATTA CGGTGCGGCA CATCGAGTCC ATGATCCGCA TGGCGGAGGC 2400 

CCACGCGCGC ATCCATCTGC GGGACTATGT GATCGAAGAC GACGTCAACA TGGCCATCCG 2460 

CGTGATGCTG GAGAGCTTCA TAGACACACA GAAGTTCAGC GTCATGCGCA GCATGCGCAA 2520 

GACTTTTGCC CGCTACCTTT CATTCCGGCG TGACAACAAT GAGCTGTTGC TCTTCATACT 2580 

GAAGCAGTTA GTGGCAGAGC AGGTGACATA TCAGCGCAAC CGCTTTGGGG CCCAGCAGGA 2640 

CACTATTGAG GTCCCTGAGA AGGACTTGGT GGATAAGGCT CGTCAGATCA ACATCCACAA 2700 

CCTCTCTGCA TTTTATGACA GTGAGCTCTT CAGGATGAAC AAGTTCAGCC ACGACCTGAA 2760 

AAGGAAAATG ATCCTGCAGC AGTTCTGAGG CCCTATGCCA TCCATAAGGA TTCCTTGGGA 2820 

TTCTGGTTTG GGGTGGTCAG TGCCCTCTGT GCTTTATGGA CACAAAACCA GAGCACTTGA 2880 

TGAACTCGGG GTACTAGGGT CAGGGCTTAT AGCAGGATGT CTGGCTGCAC CTGGCATGAC 2940 

TGTTTGTTTC TCCAAGCCTG CTTTGTGCTT CTCACCTTTG GGTGGGATGC CTTGCCAGTG 3000 

TGTCTTACTT GGTTGCTGAA CATCTTGCCA CCTCOGAGTG CTTTGTCTCC ACTCAGTACC 3060 

TTGGATCAGA GCTGCTGAGT TCAGGATGCC TGCGTGTGGT TTAGGTGTTA GCCTTCTTAC 3120 

ATGGATGTCA GGAGAGCTGC TGCCCTCTTG GCGTGAGTTG CGTATTCAGG CTGCTTTTGC 3180

TGCCTTTGGC CAGAGAGCTG GTTGAAGATG TTTGTAATCG TTTTCAGTCT CCTGCAGGTT 3240 

TCTGTGCCCC TGTGGTGGAA GAGGGCACGA CAGTGCCAGC GCAGCGTTCT GGGCTCCTCA 3300 

GTCGCAGGGG TGGGATGTGA GTCATGCGGA TTATCCACTC GCCACAGTTA TCAGCTGCCA 3360 

TTGCTCCCTG TCTGTTTCCC CACTCTCTTA TTTGTGCATT CGGTTTGGTT TCTGTAGTTT 3420 
TAATTTTTAA TAAAGTTGAA TAAAATATAA AAAAAAAAAA AAAAAA 

Seq ID NO: 329 Protein sequence
Protein Accession #: AAH17490.X

1 11 21 31 41 51

| | | | | |

MAESSESFTM ASSPAQRRRG NDPLTSSPGR SSRRTDALTS SPGRDLPPFE DESEGLLGTE 60 

GPLEEEEDGE ELIGDGMERD YRAIPELDAY EAEGLALDDE DVEELTASQR EAAERAMRQR 120 

DREAGRGLGR MRRGLLYDSD EEDEERPARK RRQVERATED GEEOEEMIES IENLEDLKGH 180 

SVREWVSMAG PRLEIHHRFK NFLRTHVDSH GHNVFKERIS DMCKENRESL WNYEOLAAR 240 

EHVLAYFLPE APAELLQIFD EAALEWLAM YPKYDRITNH IHVRISHLPL VEELRSLRQL 300 

HLNQLIRTSG WTSCTGVLP QLSMVKYNCN KCNFVLGPFC QSQNQEVKPG SCPECQSAGP 360 

FEVNMEETIY QNYQRIRIQE SPGKVAAGRL PRSKDAILLA DLVDSCKPGD EIELTGIYHN 420 

NYDGSLNTAN GFPVPATVIL ANHVAKKDNK VAVGELTDED VKMITSLSKD QQIGEKIFAS 480 

IAPSIYGHED IKRGLALALF GGEPKNPGGK HKVRGDINVL LCGDPGTAKS QFLKYIEKVS 540

SRAIFTTGQG ASAVGLTAYV QRHPVSREWT LEAGALVLAD RGVCLIDEFD KMNDQDRTSI 600 

HEAMEQQSIS ISKAGIVTSL QARCTVIAAA NPIGGRYDPS LTFSENVDLT EPIISRFDIL 660 

CVVRDTVDPV QDEMLARFVV GSHVRHHPSN KEEEGLANGS AAEPAMPNTY GVEPLPQEVL 720

KKYIIYAKER VHPKLNQMDQ DKVAKMYSDL RKESMATGSI PITVRHIESM IRMAEAHARI 780

HLRDYVIEDD VNMAIRVMLE SFIDTQKFSV MRSMRKTFAR YLSFRRDNNE LLLFILKQLV 840

AEQVTYQRNR FGAQQDTIEV PEKDLVDKAR QINIHNLSAF YDSELFRMNK FSHDLKRKMI 900 
LQQF 

Seq ID NO: 330 DNA sequence
Nucleic Acid Accession #: M17254
Coding sequence: 257-1645

1 11 21 31 41 51

| | | | | |

GTCCGCGCGT GTCCGCGCCC GCGTGTGCCA GCGCGCGTGC CTTGGCCGTG CGCGCOGAGC 60 

CGGGTCGCAC TAACTCCCTC GGCGCCGACG GCGGCGCTAA CCTCTCGGTT ATTCCAGGAT 120 

CTTTGGAGAC CCGAGGAAAG CCGTGTTGAC CAAAAGCAAG ACAAATGACT CACAGAGAAA 180 

AAAGATGGCA GAACCAAGGG CAACTAAAGC CGTCAGGTTC TGAACAGCTG GTAGATGGGC 240 

TGGCTTACTG AAGGACATGA TTCAGACTGT CCCGGACCCA GCAGCTCATA TCAAGGAAGC 300 

CTTATCAGTT GTGAGTGAGG ACCAGTCGTT GTTTGAGTGT GCCTACGGAA CGCCACACCT 360 

GGCTAAGACA GAGATGACCG CGTCCTCCTC CAGCGACTAT GGACAGACTT CCAAGATGAG 420 

CCCACGCGTC CCTCAGCAGG ATTGGCTGTC TCAACCCCCA GCCAGGGTCA CCATCAAAAT 480 

GGAATGTAAC CCTAGCCAGG TGAATGGCTC AAGGAACTCT CCTGATGAAT GCAGTGTGGC 540 

CAAAGGCGGG AAGATGGTGG GCAGCCCAGA CACCGTTGGG ATGAACTACG GCAGCTACAT 600 

GGAGGAGAAG CACATGCCAC CCCCAAACAT GACCACGAAC GAGCGCAGAG TTATCGTGCC 660 

AGCAGATCCT ACGCTATGGA GTACAGACCA TGTGCGGCAG TGGCTGGAGT GGGCGGTGAA 720 

AGAATATGGC CTTCCAGACG TCAACATCTT GTTATTCCAG AACATCGATG GGAAGGAACT 780 

GTGCAAGATG ACCAAGGACG ACTTCCAGAG GCTCACCCCC AGCTACAACG CCGACATCCT 840

TCTCTCACAT CTCCACTACC TCAGAGAGAC TCCTCTTCCA CATTTGACTT CAGATGATGT 900 

TGATAAAGCC TTACAAAACT CTCCACGGTT AATGCATGCT AGAAACACAG ATTTACCATA 960 

TGAGCCCCCC AGGAGATCAG CCTGGACCGG TCACGGCCAC CCCACGCCCC AGTCGAAAGC 1020 

TGCTCAACCA TCTCCTTCCA CAGTGCCCAA AACTGAAGAC CAGCGTCCTC AGTTAGATCC 1080 

TTATCAGATT CTTGGACCAA CAAGTAGCCG CCTTGCAAAT CCAGGCAGTG GCCAGATCCA 1140 

GCTTTGGCAG TTCCTCCTGG AGCTCCTGTC GGACAGCTCC AACTCCAGCT GCATCACCTG 1200 

GGAAGGCACC AACGGGGAGT TCAAGATGAC GGATCCCGAC GAGGTGGCCC GGCGCTGGGG 1260 

AGAGCGGAAG AGCAAACCCA ACATGAACTA CGATAAGCTC AGCCGCGCCC TCCGTTACTA 1320 

CTATGACAAG AACATCATGA CCAAGGTCCA TGGGAAGCGC TACGCCTACA AGTTCGACTT 1380 

CCACGGGATC GCCCAGGCCC TCCAGCCCCA CCCCCCGGAG TCATCTCTGT ACAAGTACCC 1440 

CTCAGACCTC CCGTACATGG GCTCCTATCA CGCCCACCCA CAGAAGATGA ACTTTGTGGC 1500 

GCCCCACCCT CCAGCCCTCC CCGTGACATC TTCCAGTTTT TTTGCTGCCC CAAACCCATA 1560 

CTGGAATTCA CCAACTGGGG GTATATACCC CAACACTAGG CTCCCCACCA GCCATATGCC 1620 

TTCTCATCTG GGCACTTACT ACTAAAGACC TGGCGGAGGC TTTTCCCATC AGCGTGCATT 1680 

CACCAGCCCA TCGCCAGAAA CTCTATCGGA GAACATGAAT CAAAAGTGCC TCAAGAGGAA 1740 

TGAAAAAAGC TTTACTGGGG CTGGGGAAGG AAGCCGGGGA AGAGATCCAA AGACTCTTGG 1800 

GAGGGAGTTA CTGAAGTCTT ACTACAGAAA TGAGGAGGAT GCTAAAAATG TCACGAATAT 1860 

GGACATATCA TCTGTGGACT GACCTTGTAA AAGACAGTGT ATGTAGAAGC ATGAAGTCTT 1920 

AAGGACAAAG TGCCAAAGAA AGTGGTCTTA AGAAATGTAT AAACTTTAGA GTAGAGTTTG 1980 

AATCCCACTA ATGCAAACTG GGATGAAACT AAAGCAATAG AAACAACACA GTTTTGACCT 2040 

AACATACCGT TTATAATGCC ATTTTAAGGA AAACTACCTG TATTTAAAAA TAGTTTCATA 2100 

TCAAAAACAA GAGAAAAGAC AOGAGAGAGA CTGTGGCCCA TCAACAGACG TTGATATGCA 2160 

ACTGCATGGC ATGTGCTGTT TTGGTTGAAA TCAAATACAT TCCGTTTGAT GGACAGCTGT 2220 

CAGCTTTCTC AAACTGTGAA GATGACCCAA AGTTTCCAAC TCCTTTACAG TATTACCGGG 2280 

ACTATGAACT AAAAGGTGGG ACTGAGGATG TGTATAGAGT GAGCGTGTGA TTGTAGACAG 2340 

AGGGGTGAAG AAGGAGGAGG AAGAGGCAGA GAAGGAGGAG ACCAGGCTGG GAAAGAAACT 2400 

TCTCAAGCAA TGAAGACTGG ACTCAGGACA TTTGGGGACT GTGTACAATG AGTTATGGAG 2460 

ACTCGAGGGT TCATGCAGTC AGTGTTATAC CAAACCCAGT GTTAGGAGAA AGGACACAGC 2520 

GTAATGGAGA AAGGGAAGTA GTAGAATTCA GAAACAAAAA TGCGCATCTC TTTCTTTGTT 2580 

TGTCAAATGA AAATTTTAAC TGGAATTGTC TGATATTTAA GAGAAACATT CAGGACCTCA 2640 

TCATTATGTG GGGGCTTTGT TCTCCACAGG GTCAGGTAAG AGATGGCCTT CTTGGCTGCC 2700 
ACAATCAOAA ATCACGCAGG CATTTTGGGT AGGCGGCCTC CAGTTTTCCT TTGAGTCGCG 2760

AACGCTGTGC GTTTGTCAGA ATGAAGTATA CAAGTCAATG TTTTTCCCCC TTTTTATATA 2820 

ATAATTATAT AACTTATGCA TTTATACACT ACGAGTTGAT CTCGGCCAGC CAAAGACACA 2880 

CGACAAAAGA GACAATCGAT ATAATGTGGC CTTGAATTTT AACTCTGTAT GCTTAATGTT 2940 

TACAATATGA AGTTATTAGT TCTTAGAATG CAGAATGTAT GTAATAAAAT AAGCTTGGCC 3000 

TAGCATGGCA AATCAGATTT ATACAGGAGT CTGCATTTGC ACTTTTTTTA GTGACTAAAG 3060 

TTGCTTAATG AAAACATGTG CTGAATGTTG TGGATTTTGT GTTATAATTT ACTTTGTCCA 3120 
GGAACTTGTG CAAGGGAGAG CCAAGGAAAT AGGATGTTTG GCACCC 

Seq ID NO: 331 Protein sequence
Protein Accession #: AAA52398

1 11 21 31 41 51

| | | | | |

MIQTVPDPAA HIKEALSVVS EDQSLFECAY GTPHLAKTEM TASSSSDYGQ TSKMSPRVPQ 60

QDWLSQPPAR VTIKMECNPS QVNGSRNSPD ECSVAKGGKM VGSPDTVGMN YGSYMEEKHM 120 

PPPNMTTNER RVIVPADPTL WSTDHVRQWL EWAVKEYGLP DVNILLFQNI DGKELCKMTK 180

DDFQRLTPSY NADILLSHLH YLRETPLPHL TSDDVDKALQ NSPRLMHARN TDLPYEPPRR 240 

SAWTGHGHPT PQSKAAQPSP STVPKTEDQR PQLDPYQILG PTSSRLANPG SGQIQLWQFL 300 

LELLSDSSNS SCITWEGTNG EFKMTDPDEV ARRWGERKSK PNMNYDKLSR ALRYYYDKNI 360

MTKVHGKRYA YKFDFHGIAQ ALQPHPPESS LYKYPSDLPY MGSYHAHPQK MNFVAPHPPA 420 

LPVTSSSFFA APNPYWNSPT GGIYPNTRLP TSHMPSHLGT YY 462 

Seq ID NO: 332 DNA sequence 
Nucleic Acid Accession #: NM_000020 
Coding sequence: 283-1794 

1 11 21 31 41 51

| | | | | |

AGGAAACGGT TTATTAGGAG GGAGTGGTGG AGCTGGGCCA GGCAGGAAGA CGCTGGAATA 60 

AGAAACATTT TTGCTCCAGC CCCCATCCCA GTCCCGGGAG GCTGCCGOGC CAGCTGCGCC 120

GAGCGAGCCC CTCCCCGGCT CCAGCCCGGT CCGGGGCCGC GCCGGACCCC AGCCCGCCGT 180 

CCAGCGCTGG CGGTGCAACT GCGGCCGCGC GGTGGAGGGG AGGTGGCCCC GGTCCGCCGA 240 

AGGCTAGCGC CCCGCCACCC GCAGAGCGGG CCCAGAGGGA CCATGACCTT GGGCTCCCCC 300 

AGGAAAGGCC TTCTGATGCT GCTGATGGCC TTGGTGACCC AGGGAGACCC TGTGAAGCCG 360 

TCTCGGGGCC CGCTGGTGAC CTGCACGTGT GAGAGCCCAC ATTGCAAGGG GCCTACCTGC 420 

CGGGGGGCCT GGTGCACAGT AGTGCTGGTG CGGGAGGAGG GGAGGCACCC CCAGGAACAT 480 

CGGGGCTGCG GGAACTTGCA CAGGGAGCTC TGCAGGGGGC GCCCCACCGA GTTCGTCAAC 540 

CACTACTGCT GCGACAGCCA CCTCTGCAAC CACAACGTGT CCCTGGTGCT GGAGGCCACC 600 

CAACCTCCTT CGGAGCAGCC GGGAACAGAT GGCCAGCTGG CCCTGATCCT GGGCCCCGTG 660 

CTGGCCTTGC TGGCCCTGGT GGCCCTGGGT GTCCTGGGCC TGTGGCATGT CCGACGGAGG 720 

CAGGAGAAGC AGCGTGGCCT GCACAGCGAG CTGGGAGAGT CCAGTCTCAT CCTGAAAGCA 780 

TCTGAGCAGG GCGACACGAT GTTGGGGGAC CTCCTGGACA GTGACTGCAC CACAGGGAGT 840 

GGCTCAGGGC TCCCCTTCCT GGTGCAGAGG ACAGTGGCAC GGCAGGTTGC CTTGGTGGAG 900 

TGTGTGGGAA AAGGCCGCTA TGGCGAAGTG TGGCGGGGCT TGTGGCACGG TGAGAGTGTG 960 

GCCGTCAAGA TCTTCTCCTC GAGGGATGAA CAGTCCTGGT TCCGGGAGAC TGAGATCTAT 1020 

AACACAGTAT TGCTCAGACA CGACAACATC CTAGGCTTCA TCGCCTCAGA CATGACCTCC 1080 

CGCAACTCGA GCACGCAGCT GTGGCTCATC ACGCACTACC ACGAGCACGG CTCCCTCTAC 1140 

GACTTTCTGC AGAGACAGAC GCTGGAGCCC CATCTGGCTC TGAGGCTAGC TGTGTCCGCG 1200 

GCATGCGGCC TGGCGCACCT GCACGTGGAG ATCTTCGGTA CACAGGGCAA ACCAGCCATT 1260 

GCCCACCGCG ACTTCAAGAG CCGCAATGTG CTGGTCAAGA GCAACCTGCA GTGTTGCATC 1320 

GCCGACCTGG GCCTGGCTGT GATGCACTCA CAGGGCAGCG ATTACCTGGA CATCGGCAAC 1380 

AACCCGAGAG TGGGCACCAA GCGGTACATG GCACCCGAGG TGCTGGACGA GCAGATCCGC 1440 

ACGGACTGCT TTGAGTCCTA CAAGTGGACT GACATCTGGG CCTTTGGCCT GGTGCTGTGG 1500 

GAGATTGCCC GCCGGACCAT CGTGAATGGC ATCGTGGAGG ACTATAGACC ACCCTTCTAT 1560 

GATGTGGTGC CCAATGACCC CAGCTTTGAG GACATGAAGA AGGTGGTGTG TGTGGATCAG 1620 

CAGACCCCCA CCATCCCTAA CCGGCTGGCT GCAGACCCGG TCCTCTCAGG CCTAGCTCAG 1680 

ATGATGCGGG AGTGCTGGTA CCCAAACCCC TCTGCCCGAC TCACCGCGCT GCGGATCAAG 1740 

AAGACACTAC AAAAAATTAG CAACAGTCCA GAGAAGCCTA AAGTGATTCA ATAGCCCAGG 1800 

AGCACCTGAT TCCTTTCTGC CTGCAGGGGG CTGGGGGGGT GGGGGGCAGT GGATGGTGCC 1860 

CTATCTGGGT AGAGGTAGTG TGAGTGTGGT GTGTGCTGGG GATGGGCAGC TGCGCCTGCC 1920 

TGCTCGGCCC CCAGCCCACC CAGCCAAAAA TACAGCTGGG CTGAAACCTG ATCCCCTGCT 1980 

GTCTGGCCTG CTCAAAGCGG CAGGCTCCCT GACGCCTGGC TCTCTCCCCA CCCCTATGGC 2040 

CAGCATGGTG CACCCCCTAC CACTCCCGGG ACAGGATGCA AAAGAGGCTC CAGAGTCAGA 2100 

GTGCCAAGCC AGGGAATCCC AGTCCCAGAC TCAGAGCCCG GGCCTGCACT TTGCCCCCTG 2160 

CCCTTGATCA ACCCCACTGC CCCACCAGAG CTGCCAGGGT GGCACAGGGC CCTGTCCAGC 2220 

CCCTGGCACA CACTTCCCTG CCAGGCCTCA GCCTCTAGCA TAAGCTCCAG AGAGCCAGGG 2280 

CCCATCAGTT TCTCTCTGTG GATTTGTATC TCAGCTCCAT GATGCCTTGG GCTTTCTGTC 2340 

TCCTCAACAA GAGTGCAGCT TGCTGAATGT CAGCTGCCTG AGAGAGCTGG GGCCTGACTT 2400 

ACTAGGGCAT TAAATCCTAA GAGGTCCTAC TGAGGTGTGG CAGGATCACA GGCCAGTGGA 2460 

AAAAGGGCAG GTCAGATGGG CAAGGCCCAG GACTTTCAGA TTAACTGAGA GGATATCGAG 2520 

GCCAAGCATG GCAGGGGGAA GGTCAGTGGG TGTCAAGAGA CCCAGGTCTG ACCCCGGATG 2580 

TTTGCTCCAT GTGACAAAAG CAGGCCTGTC TCAGGACCTT TTCTTTTCTT TTTTCCTTCT 2640 

TTTTTTTTTT GACACGGAGT TTCGCTCTTG TTGTCCAGGC TAGAGTGCAA TGGCATGATC 2700 

CCAGCTCACC GCAACGTCTA CCTCCCAGGT TCAAATCATT CTCTTGCCTC AGACTCCCGA 2760 

GTAGCTGGGA TTACAGGCAC ATGCCACCAT GCCTGGCTAA TTTTGTATAT TTAGTAGAAA 2820 

CAGGGTTTCA CCATGCTGGC CATGCTGGTT CTCGAACTCC TGACCTCAGG TGTTCCACCT 2880 

ACCTCAGCCT CCCAAAGTGC TGGGGTTACA GGTGTGAGCC ATCGCGCCTG GCCAGGACCT 2940 

TTGTTTCTTA TCTACATATT GGAAGATTTG GTCCTGATGT CCTTTGAGGC TTCTTTAGCT 3000 

CTAGTTCTCT GACACTTCAG CCTATATCAC AGCTAACTTC YTCAGTCTCA TCTATTCCTT 3060 

ATGCTCCAGC CCCTGGCAAT TTGCCTCAAG ATGGGGGTTT GAAAATAACT TTACCTGACT 3120 

CAAGGAGTGT CTGGAGCACC TCCTAGTCTA AGTCTGCAAG CTCCAGTTCT TGCCTAAAAC 3180 

CATGCCAGTG GCCACCCTTG GGCTCAGACA GCTCTGGGCC TTTTGACCAC AAGCCAGCCC 3240 

CTCGCCCTCT CTGTGGCATA GTCTTCTCTG CCCCAGGACT GCAGGGCGGC TTCCTCCAAG 3300 

GCTTCCAAGG CTCAAAAGAA ATTTGGCTCC ATCCAAGAAG GCTCCAGCTC CCCTACTGGC 3360 

CCCTGGCTTC AGGCCCACAC CCCTGGGCCA GGSCCAGAGA GTGTGTCTCA GGAGAATTCA 3420 

ATGGGCTCTA GAGAGACACA CAGAAAGTTT GGGCATTTGG GAAATTTTCA AGGRTGTATG 3480 
TATGGYTCAC GTATGGWGCA GGTTGTCCTG GTCCYKGGGT GCAGGGAAGT GGGCTGCAGG 3540

GAAGTGGATT GGAGGGGAGC TTGAGGAATA TAAGGAGCGG GGGTGGAGAC TCAGGCTATG 3600 

GACAAGGACA GCCCCAAGGT TGGGAAGACC TGGCCTTAGT CGTCCTCAGC CTAGGGCAGG 3660 

GCAGTGAAGA AAGCTCTCCC CGCTCCTGCT GTAATGACCC AGAGTAGCCT CCCCAGGCCG 3720 

GCATCTTATG TGTGTCTTCC ACCATCCTCA TGGTGGCACT TTTCTAGGCC TGTCTCCCAG 3780

CATTGTGCAA GGCTCGGAAG AGAACCAGGA AGTGAAACTG GGTGAAAACA GAAAGCTCAA 3840 

TGGATGGGCT AGGTTCCCAG ATCATTAGGG CAGAGTTTGC ACGTCCTCTG GTTCACTGGG 3900 

AATCCACCCA GCCCACGAAT CATCTCCCTC TTTGAAGGAT TTTWATTTCT ACTGGGTTTT 3960 

GGAACAAACT CCTGCTGAGA CCCCACAGCC AGAAACTGAA AGCAGCAGCT CCCCAAAGCC 4020 

TGGAAAATCC CTAAGAGAAG GCCTGGGGGA MAGGAAKTGG AGTGACAGGG GACAGGTAGA 4080 

GAGAAGGGGG CCCAATGGCC AGGGAGTGAA GGAGGTGGCG TTGCTGAGAG CAGTCTGCAC 4140

ATGCTTCTGT CTGAGTGCAG GAAGGTGTTC CAGGGTCGAA ATTACACTTC TCGTACCTGG 4200 

AGACGCTGTT TGTGGGAGCA CTGGGCTCAT GCCTGGCACA CAATAGGTCT GCAATAAACC 4260 
ATGGTTAAAT CCTGAAAAAA AAAAAAAAA 

Seq ID NO: 333 Protein sequence
Protein Accession #: NP_000011

1 11 21 31 41 51

| | | | | |

MTLGSPRKGL LMLLMALVTQ GDPVKPSRGP LVTCTCESPH CKGPTCRGAW CTVVLVREEG 60

RHPQEHRGCG NLHRELCRGR PTEFVNHYCC DSHLCNHNVS LVLEATQPPS EQPGTDGQLA 120 

LILGPVLALL ALVALGVLGL WHVRRRQEKQ RGLHSELGES SLILKASEQG DTMLGDLLDS 180

DCTTGSGSGL PFLVQRTVAR QVALVECVGK GRYGEVWRGL WHGESVAVKI FSSRDEQSWF 240

RETEIYNTVL LRHDNILGFI ASDMTSRNSS TQLWLITHYH EHGSLYDFLQ RQTLEPHLAL 300 

RLAVSAACGL AHLHVEIFGT QGKPAIAHRD FKSRNVLVKS NLQCCIADLG LAVMHSQGSD 360 

YLDIGNNPRV GTKRYMAPEV LDEQIRTDCF ESYKWTDIWA FGLVLWEIAR RTIVNGIVED 420 

YRPPFYDVVP NDPSFEDMKK VVCVDQQTPT IPNRLAADPV LSGLAQMMRE CWYPNPSARL 480
TALRIKKTLQ KISNSPEKPK VIQ 

Seq ID NO: 334 DNA sequence 

Nucleic Acid Accession #: NM_004126.1

Coding sequence: 108-329

1 11 21 31 41 51

| | | | | |

GGCACGAGCT CGTGCCGGCC TTCAGTTGTT TCGGGACGCG CCGAGCTTCG CCGCTCTTCC 60 
AGCGGCTCCG CTGCCAGAGC TAGCCCGAGC CCGGTTCTGG GGCGAAAATG CCTGCCCTTC 120 
ACATCGAAGA TTTGCCAGAG AAGGAAAAAC TGAAAATGGA AGTTGAGCAG CTTCGCAAAG 180 
AAGTGAAGTT GCAGAGACAA CAAGTGTCTA AATGTTCTGA AGAAATAAAG AACTATATTG 240
AAGAACGTTC TGGAGAGGAT CCTCTAGTAA AGGGAATTCC AGAAGACAAG AACCCCTTTA 300 
AAGAAAAAGG CAGCTGTGTT ATTTCATAAA TAACTTGGGA GAAACTGCAT CCTAAGTGGA 360 
AGAACTAGTT TGTTTTAGTT TTCCCAGATA AAACCAACAT GCTTTTTAAG GAAGGAAGAA 420 
TGAAATTAAA AGGAGACTTT CTTAAGCACC ATATAGATAG GGTTATGTAT AAAAGCATAT 480 
GTGCTACTCA TCTTTGCTCA CTATGCAGTC TTTTTTAAGA GAGCAGAGAG TATCAGATGT 540 
ACAATTATGG AAATAAGAAC ATTACTTGAG CATGACACTT CTTTCAGTAT ATTGCTTGAT 600 
GCTTCAAATA AAGTTTTGTC TT 

Seq ID NO: 335 Protein sequence 
Protein Accession #: NP_004117.1 

1 11 21 31 41 51 

MPALHIEDLP EKEKLKMEVE QLRKEVKLQR QQVSKCSEEI KNYIEERSGE DPLVKGIPED 60 
KNPFKEKGSC VIS 



Seq ID NO: 336 DNA sequence 
Nucleic Acid Accession #: NM_005795
Coding sequence: 555-1940

1 11 21 31 41 51

| | | | | |

GCACGAGGGA ACAACCTCTC TCTCTSCAGC AGAGAGTGTC ACCTCCTGCT TTAGGACCAT 60 

CAAGCTCTGC TAACTGAATC TCATCCTAAT TGCAGGATCA CATTGCAAAG CTTTCACTCT 120 

TTCCCACCTT GCTTGTGGGT AAATCTCTTC TGCGGAATCT CAGAAAGTAA AGTTCCATCC 180 

TGAGAATATT TCACAAAGAA TTTCCTTAAG AGCTGGACTG GGTCTTGACC CCTGGAATTT 240 

AAGAAATTCT TAAAGACAAT GTCAAATATG ATCCAAGAGA AAATGTGATT TGAGTCTGGA 300 

GACAATTGTG CATATCGTCT AATAATAAAA ACCCATACTA GCCTATAGAA AACAATATTT 360 

GAATAATAAA AACCCATACT AGCCTATAGA AAACAATATT TGAAAGATTG CTACCACTAA 420 

AAAGAAAACT ACTACAACTT GACAAGACTG CTGCAAACTT CAATTGGTCA CCACAACTTG 480 

ACAAGGTTGC TATAAAACAA GATTGCTACA ACTTCTAGTT TATGTTATAC AGCATATTTC 540 

ATTTGGGCTT AATGATGGAG AAAAAGTGTA CCCTGTATTT TCTGGTTCTC TTGCCTTTTT 600 

TTATGATTCT TGTTACAGCA GAATTAGAAG AGAGTCCTGA GGACTCAATT CAGTTGGGAG 660 

TTACTAGAAA TAAAATCATG ACAGCTCAAT ATGAATGTTA CCAAAAGATT ATGCAAGACC 720 

CCATTCAACA AGCAGAAGGC GTTTACTGCA ACAGAACCTG GGATGGATGG CTCTGCTGGA 780 

ACGATGTTGC AGCAGGAACT GAATCAATGC AGCTCTGCCC TGATTACTTT CAGGACTTTG 840 

ATCCATCAGA AAAAGTTACA AAGATCTGTG ACCAAGATGG AAACTGGTTT AGACATCCAG 900 

CAAGCAACAG AACATGGACA AATTATACCC AGTGTAATGT TAACACCCAC GAGAAAGTGA 960 

AGACTGCACT AAATTTGTTT TACCTGACCA TAATTGGACA CGGATTGTCT ATTGCATCAC 1020 

TGCTTATCTC GCTTGGCATA TTCTTTTATT TCAAGAGCCT AAGTTGCCAA AGGATTACCT 1080 

TACACAAAAA TCTGTTCTTC TCATTTGTTT GTAACTCTGT TGTAACAATC ATTCACCTCA 1140 

CTGCAGTGGC CAACAACCAG GCCTTAGTAG CCACAAATCC TGTTAGTTGC AAAGTGTCCC 1200 

AGTTCATTCA TCTTTACCTG ATGGGCTGTA ATTACTTTTG GATGCTCTGT GAAGGCATTT 1260 

ACCTACACAC ACTCATTGTG GTGGCCGTGT TTGCAGAGAA GCAACATTTA ATGTGGTATT 1320 

ATTTTCTTGG CTGGGGATTT CCACTGATTC CTGCTTGTAT ACATGCCATT GCTAGAAGCT 1380 
TATATTACAA TGACAATTGC TGGATCAGTT CTGATACCCA TCTCCTCTAC ATTATCCATG 1440

GCCCAATTTG TGCTGCTTTA CTGGTGAATC TTTTTTTCTT GTTAAATATT GTACGCGTTC 1500

TCATCACCAA GTTAAAAGTT ACACACCAAG CGGAATCCAA TCTGTACATG AAAGCTGTGA 1560

GAGCTACTCT TATCTTGGTG CCATTGCTTG GCATTGAATT TGTGCTGATT CCATGGCGAC 1620 

CTGAAGGAAA GATTGCAGAG GAGGTATATG ACTACATCAT GCACATCCTT ATGCACTTCC 1680 

AGGGTCTTTT GGTCTCTACC ATTTTCTGCT TCTTTAATGG AGAGGTTCAA GCAATTCTGA 1740 

GAAGAAACTG GAATCAATAC AAAATCCAAT TTGGAAACAG CTTTTCCAAC TCAGAAGCTC 1800 

TTCGTAGTGC GTCTTACACA GTGTCAACAA TCAGTGATGG TCCAGGTTAT AGTCATGACT 1860 

GTCCTAGTGA ACACTTAAAT GGAAAAAGCA TCCATGATAT TGAAAATGTT CTCTTAAAAC 1920 

CAGAAAATTT ATATAATTGA AAATAGAAGG ATGGTTGTCT CACTGTTTGG TGCTTCTCCT 1980

AACTCAAGGA CTTGGACCCA TGACTCTGTA GCCAGAAGAC TTCAATATTA AATGACTTTG 2040 

GGGAATGTCA TAAAGAAGAG CCTTCACATG AAATTAGTAG TGTGTTGATA AGAGTGTAAC 2100 

ATCCAGCTCT ATGTGGGAAA AAAGAAATCC TGGTTTGTAA TGTTTGTCAG TAAATACTCC 2160 

CACTATGCCT GATGTGACGC TACTAACCTG ACATCACCAA GTGTGGAATT GGAGAAAAGC 2220 

ACAATCAACT TTTCTGAGCT GGTGTAAGCC AGTTCCAGCA CACCATTGAT GAATTCAAAC 2280 

AAATGGCTGT AAAACTAAAC ATACATGTTG GGCATGATTC TACCCTTATT CSCCCCAAGA 2340

GACCTAGCTA AGGTCTATAA ACATGAAGGG AAAATTAGCT TTTAGTTTTA AAACTCTTTA 2400

TCCCATCTTG ATTGGGGCAG TTGACTTTTT TTTTTTCCCA GAGTGCCGTA GTCCTTTTTG 2460 

TAACTACCCT CTCAAATGGA CAATACCAGA AGTGAATTAT CCCTGCTGGC TTTCTTTTCT 2520 

CTATGAAAAG CAACTGAGTA CAATTGTTAT GATCTACTCA TTTGCTGACA CATCAGTTAT 2580

ATCTTGTGGC ATATCCATTG TGGAAACTGG ATGAACAGGA TGTATAATAT GCAATCTTAC 2640 

TTCTATATCA TTAGGAAAAC ATCTTAGTTG ATGCTACAAA ACACCTTGTC AACCTCTTCC 2700 

TGTCTTACCA AACAGTGGGA GGGAATTCCT AGCTGTAAAT ATAAATTTTG CCCTTCCATT 2760 

TCTACTGTAT AAACAAATTA GCAATCATTT TATATAAAGA AAATCAATGA AGGATTTCTT 2820

ATTTTCTTGG AATTTTGTAA AAAGAAATTG TGAAAAATGA GCTTGTAAAT ACTCCATTAT 2880 

TTTATTTTAT AGTCTCAAAT CAAATACATA CAACCTATGT AATTTTTAAA GCAAATATAT 2940 

AATGCAACAA TGTGTGTATG TTAATATCTG ATACTGTATC TGGGCTGATT TTTTAAATAA 3000 
AATAGAGTCT GGAATGCT 



Seq ID NO: 337 Protein sequence
Protein Accession #: NP_005786.1

1 11 21 31 41 51

| | | | | |

MEKKCTLYFL VLLPFFMILV TAELEESPED SIQLGVTRNK IMTAQYECYQ KIMQDPIQQA 60 

EGVYCNRTWD GWLCWNDVAA GTESMQLCPD YFQDFDPSEK VTKICDQDGN WFRHPASNRT 120

WTNYTQCNVN THEKVKTALN LFYLTIIGHG LSIASLLISL GIFFYFKSLS CQRITLHKNL 180 

FFSFVCNSVV TIIHLTAVAN NQALVATNPV SCKVSQFIHL YLMGCNYFWM LCEGIYLHTL 240

IVVAVFAEKQ HLMWYYFLGW GFPLIPACIH AIARSLYYND NCWISSDTHL LYIIHGPICA 300

ALLVNLFFLL NIVRVLITKL KVTHQAESNL YMKAVRATLI LVPLLGIEFV LIPWRPEGKI 360

AEEVYDYIMH ILMHFQGLLV STIFCFFNGE VQAILRRNWN QYKIQFGNSF SNSEALRSAS 420 
YTVSTISDGP GYSHDCPSEH LNGKSIHDIE NVLLKPENLY N 

Seq ID NO: 338 DNA sequence
Nucleic Acid Accession #: NM_001795
Coding sequence: 25-2379

1 11 21 31 41 51

| | | | | |

GCACGATCTG TTCCTCCTGG GAAGATGCAG AGGCTCATGA TGCTCCTCGC CACATCGGGC 60 

GCCTGCCTGG GCCTGCTGGC AGTGGCAGCA GTGGCAGCAG CAGGTGCTAA CCCTGCCCAA 120 

CGGGACACCC ACAGCCTGCT GCCCACCCAC CGGCGCCAAA AGAGAGATTG GATTTGGAAC 180 

CAGATGCACA TTGATGAAGA GAAAAACACC TCACTTCCCC ATCATGTAGG CAAGATCAAG 240 

TCAAGCGTGA GTCGCAAGAA TGCCAAGTAC CTGCTCAAAG GAGAATATGT GGGCAAGGTC 300 

TTCCGGGTCG ATGCAGAGAC AGGAGACGTG TTCGCCATTG AGAGGCTGGA CCGGGAGAAT 360 

ATCTCAGAGT ACCACCTCAC TGCTGTCATT GTGGACAAGG ACACTGGTGA AAACCTGGAG 420 

ACTCCTTCCA GCTTCACCAT CAAAGTTCAT GACGTGAACG ACAACTGGCC TGTGTTCACG 480 

CATCGGTTGT TCAATGCGTC CGTGCCTGAG TCGTCGGCTG TGGGGACCTC AGTCATCTCT 540 

GTGACAGCAG TGGATGCAGA CGACCCCACT GTGGGAGACC ACGCCTCTGT CATGTACCAA 600 

ATCCTGAAGG GGAAAGAGTA TTTTGCCATC GATAATTCTG GACGTATTAT CACAATAACG 660 

AAAAGCTTGG ACCGAGAGAA GCAGGCCAGG TATGAGATCG TGGTGGAAGC GCGAGATGCC 720 

CAGGGCCTCC GGGGGGACTC GGGCACGGCC ACCGTGCTGG TCACTCTGCA AGACATCAAT 780 

GACAACTTCC CCTTCTTCAC CCAGACCAAG TACACATTTG TCGTGCCTGA AGACACCCGT 840 

GTGGGCACCT CTGTGGGCTC TCTGTTTGTT GAGGACCCAG ATGAGCCCCA GAACCGGATG 900 

ACCAAGTACA GCATCTTGCG GGGCGACTAC CAGGACGCTT TCACCATTGA GACAAACCCC 960 

GCCCACAACG AGGGCATCAT CAAGCCCATG AAGCCTCTGG ATTATGAATA CATCCAGCAA 1020 

TACAGCTTCA TCGTCGAGGC CACAGACCCC ACCATCGACC TCCGATACAT GAGCCCTCCC 1080 

GCGGGAAACA GAGCCCAGGT CATTATCAAC ATCACAGATG TGGACGAGCC CCCCATTTTC 1140 

CAGCAGCCTT TCTACCACTT CCAGCTGAAG GAAAACCAGA AGAAGCCTCT GATTGGCACA 1200 

GTGCTGGCCA TGGACCCTGA TGCGGCTAGG CATAGCATTG GATACTCCAT CCGCAGGACC 1260 

AGTGACAAGG GCCAGTTCTT CCGAGTCACA AAAAAGGGGG ACATTTACAA TGAGAAAGAA 1320 

CTGGACAGAG AAGTCTACCC CTGGTATAAC CTGACTGTGG AGGCCAAAGA ACTGGATTCC 1380 

ACTGGAACCC CCACAGGAAA AGAATCCATT GTGCAAGTCC ACATTGAAGT TTTGGATGAG 1440 

AATGACAATG CCCCGGAGTT TGCCAAGCCC TACCAGCCCA AAGTGTGTGA GAACGCTGTC 1500 

CATGGCCAGC TGGTCCTGCA GATCTCCGCA ATAGACAAGG ACATAACACC ACGAAACGTG 1560 

AAGTTCAAAT TCACCTTGAA TACTGAGAAC AACTTTACCC TCACGGATAA TCACGATAAC 1620 

ACGGCCAACA TCACAGTCAA GTATGGGCAG TTTGACCGGG AGCATACCAA GGTCCACTTC 1680 

CTACCCGTGG TCATCTCAGA CAATGGGATG CCAAGTCGCA CGGGCACCAG CACGCTGACC 1740 

GTGGCCGTGT GCAAGTGCAA CGAGCAGGGC GAGTTCACCT TCTGCGAGGA TATGGCCGCC 1800 

CAGGTGGGCG TGAGCATCCA GGCAGTGGTA GCCATCTTAC TCTGCATCCT CACCATCACA 1860 

GTGATCACCC TGCTCATCTT CCTGCGGCGG CGGCTCCGGA AGCAGGCCCG CGCGCACGGC 1920 

AAGAGCGTGC CGGAGATCCA CGAGCAGCTG GTCACCTACG ACGAGGAGGG CGGCGGCGAG 1980 

ATGGACACCA CCAGCTACGA TGTGTCGGTG CTCAACTCGG TGCGCCGCGG CGGGGCCAAG 2040 

CCCCCGCGGC CCGCGCTGGA CGCCCGGCCT TCCCTCTATG CGCAGGTGCA GAAGCCACCG 2100 

AGGCACGCGC CTGGGGCACA CGGAGGGCCC GGGGAGATGG CAGCCATGAT CGAGGTGAAG 2160

AAGGACGAGG CGGACCACGA CGGCGACGGC CCCCCCTACG ACACGCTGCA CATCTACGGC 2220 

TACGAGGGCT CCGAGTCCAT AGCCGAGTCC CTCAGCTCCC TGGGCACCGA CTCATCCGAC 2280 
TCTGACGTGG ATTACGACTT CCTTAACGAC TGGGGACCCA GGTTTAAGAT GCTGGCTGAG 2340

CTGTACGGCT CGGACCCCCG GGAGGAGCTG CTGTATTAGG CGGCCGAGGT CACTCTGGGC 2400

CTGGGGACCC AAACCCCCTG CAGCCCAGGC CAGTCAGACT CCAGGCACCA CAGCCTCCAA 2460 

AAATGGCAGT GACTCCCCAG CCCAGCACCC CTTCCTCGTG GGTCCCAGAG ACCTCATCAG 2520

CCTTGGGATA GCAAACTCCA GGTTCCTGAA ATATCCAGGA ATATATGTCA GTGATGACTA 2580 

TTCTCAAATG CTGGCAAATC CAGGCTGGTG TTCTGTCTGG GCTCAGACAT CCACATAACC 2640 

CTGTCACCCA CAGACOGCCG TCTAACTCAA AGACTTCCTC TGGCTCCCCA AGGCTGCAAA 2700 

GCAAAACAGA CTGTGTTTAA CTGCTGCAGG GTCTTTTTCT AGGGTCCCTG AACGCCCTGG 2760 

TAAGGCTGGT GAGGTCCTGG TGCCTATCTG CCTGGAGGCA AAGGCCTGGA CAGCTTGACT 2820 

TGTGGGGCAG GATTCTCTGC AGCCCATTCC CAAGGGAGAC TGACCATCAT GCCCTCTCTC 2880 

GGGAGCCCTA GCCCTGCTCC AACTCCATAC TCCACTCCAA GTGCCCCACC ACTCCCCAAC 2940 

CCCTCTCCAG GCCTGTCAAG AGGGAGGAAG GGGCCCCATG GCAGCTCCTG ACCTTGGGTC 3000 

CTGAAGTGAC CTCACTGGCC TGCCATGCCA GTAACTGTGC TGTACTGAGC ACTGAACCAC 3060 

ATTCAGGGAA ATGCTTATTA AACCTTGAAG CAACTGTGAA TTCATTCTGG AGGGGCAGTG 3120 

GAGATCAGGA GTGACAGATC ACAGGGTGAG GGCCACCTCC ACACCCACCC CCTCTGGAGA 3180 

AGGCCTGGAA GAGCTGAGAC CTTGCTTTGA GACTCCTCAG CACCCCTCCA GTTTTGCCTG 3240 

AGAAGGGGCA GATGTTCCCG GAGATCAGAA GACGTCTCCC CTTCTCTGCC TCACCTGGTC 3300 

GCCAATCCAT GCTCTCTTTC TTTTCTCTGT CTACTCCTTA TCCCTTGGTT TAGAGGAACC 3360 

CAAGATGTGG CCTTTAGCAA AACTGACAAT GTCCAAACCC ACTCATGACT GCATGACGGA 3420 

GCCGAGCATG TGTCTTTACA CCTOGCTGTT GTCACATCTC AGGGAACTGA CCCTCAGGCA 3480 

CACCTTGCAG AAGGAAGGCC CTGCCCTGCC CAACCTCTGT GGTCACCCAT GCATCATTCC 3540 

ACTGGAACGT TTCACTGCAA ACACACCTTG GAGAAGTGGC ATCAGTCAAC AGAGAGGGGC 3600 

AGGGAAGGAG ACACCAAGCT CACCCTTCGT CATGGACCGA GGTTCCCACT CTGGCAAAGC 3660 

CCCTCACACT GCAAGGGATT GTAGATAACA CTGACTTGTT TGTTTTAACC AATAACTAGC 3720 

TTCTTATAAT GATTTTTTTA CTAATGATAC TTACAAGTTT CTAGCTCTCA CAGACATATA 3780 

GAATAAGGGT TTTTGCATAA TAAGCAGGTT GTTATTTAGG TTAACAATAT TAATTCAGGT 3840 

TTTTTAGTTG GAAAAACAAT TCCTGTAACC TTCTATTTTC TATAATTGTA GTAATTGCTC 3900

TACAGATAAT GTCTATATAT TGGCCAAACT GGTGCATGAC AAGTACTGTA TTTTTTTATA 3960 
CCTAAATAAA GAAAAATCTT TAGCCTGGGC AACAAAAAAA 

Seq ID NO: 339 Protein sequence
Protein Accession #: NP_001786

1 11 21 31 41 51

| | | | | |

MQRLMMLLAT SGACLGLLAV AAVAAAGANP AQRDTHSLLP THRRQKRDWI WNQMHIDEEK 60 

NTSLPHHVGK IKSSVSRKNA KYLLKGEYVG KVFRVDAETG DVFAIERLDR ENISEYHLTA 120 

VIVDKDTGEN LETPSSFTIK VHDVNDNWPV FTHRLFNASV PESSAVGTSV ISVTAVDADD 180

PTVGDHASVM YQILKGKEYF AIDNSGRIIT ITKSLDREKQ ARYEIVVEAR DAQGLRGDSG 240

TATVLVTLQD INDNFPFFTQ TKYTFVVPED TRVGTSVGSL FVEDPDEPQN RMTKYSILRG 300

DYQDAFTIET NPAHNEGIIK PMKPLDYEYI QQYSFIVEAT DPTIDLRYMS PPAGNRAQVI 360 

INITDVDEPP IFQQPFYHFQ LKENQKKPLI GTVLAMDPDA ARHSIGYSIR RTSDKGQFFR 420

VTKKGDIYNE KELDREVYPW YNLTVEAKEL DSTGTPTGKE SIVQVHIEVL DENDNAPEFA 480 

KPYQPKVCEN AVHGQLVLQI SAIDKDITPR NVKFKFTLNT ENNFTLTDNH DNTANITVKY 540 

GQFDREHTKV HFLPVVISDN GMPSRTGTST LTVAVCKCNE QGEFTFCEDM AAQVGVSIQA 600

VVAILLCILT ITVITLLIFL RRRLRKQARA HGKSVPEIHE QLVTYDEEGG GEMDTTSYDV 660

SVLNSVRRGG AKPPRPALDA RPSLYAQVQK PPRHAPGAHG GPGEMAAMIE VKKDEADHDG 720

DGPPYDTLHI YGYEGSESIA ESLSSLGTDS SDSDVDYDFL NDWGPRFKML AELYGSDPRE 780
ELLY 



Seq ID NO: 340 DNA sequence 
Nucleic Acid Accession #: NM_003088 
Coding sequence: 112-1593

1 11 21 31 41 51

| | | | | |

GCGGAGGGTG CGTGCGGGCC GCGGCAGCCG AACAAAGGAG CAGGGGCGCC GCCGCAGGGA 60 

CCCGCCACCC ACCTCCCGGG GCCGCGCAGC GGCCTCTCGT CTACTGCCAC CATGACCGCC 120 

AACGGCACAG CCGAGGCGGT GCAGATCCAG TTCGGCCTCA TCAACTGCGG CAACAAGTAC 180 

CTGACGGCCG AGGCGTTCGG GTTCAAGGTG AACGCGTCCG CCAGCAGCCT GAAGAAGAAG 240 

CAGATCTGGA CGCTGGAGCA GCCCCCTGAC GAGGCGGGCA GCGCGGCCGT GTGCCTGCGC 300 

AGCCACCTGG GCCGCTACCT GGCGGCGGAC AAGGACGGCA ACGTGACCTG CGAGCGCGAG 360 

GTGCCCGGTC CCGACTGCCG TTTCCTCATC GTGGCGCACG ACGACGGTCG CTGGTCGCTG 420 

CAGTCCGAGG CGCACCGGCG CTACTTCGGC GGCACCGAGG ACCGCCTGTC CTGCTTCGCG 480 

CAGACGGTGT CCCCCGCCGA GAAGTGGAGC GTGCACATCG CCATGCACCC TCAGGTCAAC 540 

ATCTACAGTG TCACCCGTAA GCGCTACGCG CACCTGAGCG CGCGGCCGGC CGACGAGATC 600

GCCGTGGACC GCGACGTGCC CTGGGGCGTC GACTCGCTCA TCACCCTCGC CTTCCAGGAC 660 

CAGCGCTACA GCGTGCAGAC CGCCGACCAC CGCTTCCTGC GCCACGACGG GCGCCTGGTG 720 

GCGCGCCCCG AGCCGGCCAC TGGCTACACG CTGGAGTTCC GCTCCGGCAA GGTGGCCTTC 780 

CGCGACTGCG AGGGCCGTTA CCTGGCGCCG TCGGGGCCCA GCGGCACGCT CAAGGCGGGC 840 

AAGGCCACCA AGGTGGGCAA GGACGAGCTC TTTGCTCTGG AGCAGAGCTG CGCCCAGGTC 900 

GTGCTGCAGG CGGCCAACGA GAGGAACGTG TCCACGCGCC AGGGTATGGA CCTGTCTGCC 960 

AATCAGGACG AGGAGACCGA CCAGGAGACC TTCCAGCTGG AGATCGACCG CGACACCAAA 1020 

AAGTGTGCCT TCCGTACCCA CACGGGCAAG TACTGGACGC TGACGGCCAC CGGGGGCGTG 1080 

CAGTCCACCG CCTCCAGCAA GAATGCCAGC TGCTACTTTG ACATCGAGTG GCGTGACCGG 1140 

CGCATCACAC TGAGGGCGTC CAATGGCAAG TTTGTGACCT CCAAGAAGAA TGGGCAGCTG 1200 

GCCGCCTCGG TGGAGACAGC AGGGGACTCA GAGCTCTTCC TCATGAAGCT CATCAACCGC 1260 

CCCATCATCG TGTTCCGCGG GGAGCATGGC TTCATCGGCT GCCGCAAGGT CACGGGCACC 1320 

CTGGACGCCA ACCGCTCCAG CTATGACGTC TTCCAGCTGG AGTTCAACGA TGGCGCCTAC 1380 

AACATCAAAG ACTCCACAGG CAAATACTGG ACGGTGGGCA GTGACTCCGC GGTCACCAGC 1440 

AGCGGCGACA CTCCTGTGGA CTTCTTCTTC GAGTTCTGCG ACTATAACAA GGTGGCCATC 1500 

AAGGTGGGCG GGCGCTACCT GAAGGGCGAC CACGCAGGCG TCCTGAAGGC CTCGGCGGAA 1560 

ACCGTGGACC CCGCCTCGCT CTGGGAGTAC TAGGGCCGGC CCGTCCTTCC CCGCCCCTGC 1620 

CCACATGGCG GCTCCTGCCA ACCCTCCCTG CTAACCCCTT CTCCGCCAGG TGGGCTCCAG 1680 

GGCGGGAGGC AAGCCCCCTT GCCTTTCAAA CTGGAAACCC CAGAGAAAAC GGTGCCCCCA 1740 

CCTGTCGCCC CTATGGACTC CCCACTCTCC CCTCCGCCCG GGTTCCCTAC TCCCCTCGGG 1800 
TCAGCGGCTG CGGCCTGGCC CTGGGAGGGA TTTCAGATGC CCCTGCCCTC TTGTCTGCCA 1860

CGGGGCQAGT CTGGCACCTC TTTCTTCTGA CCTCAGAOGG CTCTGAGCCT TATTTCTCTG 1920 

GAAGCGGCTA AGGGACGGTT GGGGGCTGGG AGCCCTGGGC GTGTAGTGTA ACTGGAATCT 1980 

TTTGCCTCTC CCAGCCACCT CCTCCCAGCC CCCCAGGAGA GCTGGGCACA TGTCCCAAGC 2040 

CTGTCAGTGG CCCTCCCTGG TGCACTGTCC CCGAAACCCC TGCTTGGGAA GGGAAGCTGT 2100 

CGGGAGGGCT AGGACTGACC CTTGTGGTGT TTTTTTGGGT GGTGGCTGGA AACAGCCCCT 2160 

CTCCCACGTG GGAGAGGCTC AGCCTGGCTC CCTTCCCTGG AGCGGCAGGG CGTGACGGCC 2220 

ACAGGGTCTG CCCGCTGCAC GTTCTGCCAA GGTGGTGGTG GCGGGCGGGT AGGGGTGTGG 2280 

GGGCCGTCTT CCTCCTGTCT CTTTCCTTTC ACCCTAGCCT GACTGGAAGC AGAAAATGAC 2340 

CAAATCAGTA TTTTTTTTAA TGAAATATTA TTGCTGGAGG CGTCCCAGGC AAGCCTGGCT 2400 

GTAGTAGCGA GTGATCTGGC GGGGGGCGTC TCAGCACCCT CCCCAGGGGG TGCATCTCAG 2460 

CCCCCTCTTT CCGTCCTTCC CGTCCAGCCC CAGCCCTGGG CCTGGGCTGC CGACACCTGG 2520 

GCCAGAGCCC CTGCTGTGAT TGGTGCTCCC TGGGCCTCCC GGGTGGATGA AGCCAGGCGT 2580 

CGCCCCCTCC GGGAGCCCTG GGGTGAGCCG CCGGGGCCCC CCTGCTGCCA GCCTCCCCCG 2640 

TCCCCAACAT GCATCTCACT CTGGGTGTCT TGGTCTTTTA TTTTTTGTAA GTGTCATTTG 2700 

TATAACTCTA AACGCCCATG ATAGTAGCTT CAAACTGGAA ATAGCGAAAT AAAATAACTC 2760 
AGTCTGC 



Seq ID NO: 341 Protein sequence
Protein Accession #: NP_003079

MTANGTAEAV QIQFGLINCG NKYLTAEAFG FKVNASASSL KKKQIWTLEQ PPDEAGSAAV 60

CLRSHLGRYL AADKDGNVTC EREVPGPDCR FLIVAHDDGR WSLQSEAHRR YFGGTEDRLS 120

CFAQTVSPAE KWSVHIAMHP QVNIYSVTRK RYAHLSARPA DEIAVDRDVP WGVDSLITLA 180

FQDQRYSVQT ADHRFLRHDG RLVARPEPAT GYTLEFRSGK VAFRDCEGRY LAPSGPSGTL 240 

KAGKATKVGK DELFALEQSC AQVVLQAANE RNVSTRQGMD LSANQDEETD QETFQLEIDR 300

DTKKCAFRTH TGKYWTLTAT GGVQSTASSK NASCYFDIEW RDRRITLRAS NGKFVTSKKN 360

GQLAASVETA GDSELFLMKL INRPIIVFRG EHGFIGCRKV TGTLDANRSS YDVFQLEFND 420

GAYNIKDSTG KYWTVGSDSA VTSSGDTPVD FFFEFCDYNK VAIKVGGRYL KGDHAGVLKA 480 
SAETVDPASL WEY 



Seq ID NO: 342 DNA sequence 

Nucleic Acid Accession #: FGENESH predicted

Coding sequence: 660..1705

1 11 21 31 41 51

| | | | | |

CGCTCCGCAC ACATTTCCTG TCGCGGCCTA AGGGAAACTG TTGGCCGCTG GGCCCGCGGG 60

GGGATTCTTG GCAGTTGGGG GGTCCGTCGG GAGCGAGGGC GGAGGGGAAG GGAGGGGGAA 120

CCGGGTTGGG GAAGCCAGCT GTAGAGGGCG GTGACCGCGC TCCAGACACA GCTCTGCGTC 180 

CTCGAGCGGG ACAGATCCAA GTTGGGAGCA GCTCTGCGTG CGGGGCCTCA GAGAATGAGG 240 

CCGGCGTTCG CCCTGTGCCT CCTCTGGCAG GCGCTCTGGC CCGGGCCGGG CGGCGGCGAA 300 

CACCCCACTG CCGACCGTGC TGGCTGCTCG GCCTCGGGGG CCTGCTACAG CCTGCACCAC 360 

GCTACCATGA AGCGGCAGGC GGCCGAGGAG GCCTGCATCC TGCGAGGTGG GGCGCTCAGC 420 

ACCGTGCGTG CGGGCGCCGA GCTGCGCGCT GTGCTCGCGC TCCTGCGGGC AGGCCCAGGG 480 

CCCGGAGGGG GCTCCAAAGA CCTGCTGTTC TGGGTCGCAC TGGAGCGCAG GCGTTCCCAC 540 

TGCACCCTGG AGAACGAGCC TTTGCGGGGT TTCTCCTGGC TGTCCTCCGA CCCCGGCGGT 600 

CTCGAAAGCG ACACGCTGCA GTGGGTGGAG GAGCCCCAAC GCTCCTGCAC CGCGCGGAGA 660 

TGCGCGGTAC TCCAGGCCAC CGGTGGGGTC GAGCCCGCAG CTGGAAGGAG ATGCGATGCC 720 

ACCTGCGCGC CAACGGCTAC CTGTGCAAGT ACCAGTTTGA GGTCTTGTGT CCTGCGCCGC 780 

GCCCCGGGGC CGCCTCTAAC TTGAGCTATC GCGCGCCCTT CCAGCTGCAC AGCGCCGCTC 840 

TGGACTTCAG TCCACCTGGG ACCGAGGTGA GTGCGCTCTG CCGGGGACAG CTCCCGATCT 900 

CAGTTACTTG CATCGCGGAC GAAATCGGCG CTCGCTGGGA CAAACTCTCG GGCGATGTGT 960 

TGTGTCCCTG CCCCGGGAGG TACCTCCGTG CTGGCAAATG CGCAGAGCTC CCTAACTGCC 1020 

TAGACGACTT GGGAGGCTTT GCCTGCGAAT GTGCTACGGG CTTCGAGCTG GGGAAGGACG 1080 

GCCGCTCTTG TGTGACCAGT GGGGAAGGAC AGCCGACCCT TGGGGGGACC GGGGTGCCCA 1140 

CCAGGCGCCC GCCGGCCACT GCAACCAGCC CCGTGCCGCA GAGAACATGG CCAATCAGGG 1200 

TCGACGAGAA GCTGGGAGAG ACACCACTTG TCCCTGAACA AGACAATTCA GTAACATCTA 1260 

TTCCTGAGAT TCCTCGATGG GGATCACAGA GCACGATGTC TACCCTTCAA ATGTCCCTTC 1320 

AAGCCGAGTC AAAGGCCACT ATCACCCCAT CAGGGAGCGT GATTTCCAAG TTTAATTCTA 1380 

CGACTTCCTC TGCCACTCCT CAGGCTTTCG ACTCCTCCTC TGCCGTGGTC TTCATATTTG 1440 

TGAGCACAGC AGTAGTAGTG TTGGTGATCT TGACCATGAC AGTACTGGGG CTTGTCAAGC 1500 

TCTGCTTTCA CGAAAGCCCC TCTTCCCAGC CAAGGAAGGA GTCTATGGGC CCGCCGGGCC 1560 

TGGAGAGTGA TCCTGAGCCC GCTGCTTTGG GCTCCAGTTC TGCACATTGC ACAAACAATG 1620 

GGGTGAAAGT CGGGGACTGT GATCTGCGGG ACAGAGCAGA GGGTGCCTTG CTGGCGGAGT 1680 
CCCCTCTTGG CTCTAGTGAT GCATAG

Seq ID NO: 343 Protein sequence

1 11 21 31 41 51

MGKDFMTKTP KAFATKAKID KWDLIKLKSF CTAKETIIRV NSQPTDWQKT FAIYPSDKGV 60

IARIYKELEQ IYKKKKPTKT LRTHFLSRPK GNCWPLGPRG DSWQLGGPSG ARAEGKGGGT 120 

GLGKPAVEGG DRAPDTALRP RAGQIQVGSS SACGASENEA GVRPVPPLAG ALARAGRRRT 180 

PHCRPCWLLG LGGLLQPAPR YHEAAGGRGG LHPARWGAQH RACGRRAARC ARAPAGRPRA 240 

RRGLQRPAVL GRTGAQAFPL HPGERAFAGF LLAVLRPRRS RKRHAAVGGG APTLLHRAEM 300

RGTPGHRWGR ARSWKEMRCH LRANGYLCKY QFEVLCPAPR PGAASNLSYR APFQLHSAAL 360 

DFSPPGTEVS ALCRGQLPIS VTCIADEIGA RWDKLSGDVL CPCPGRYLRA GKCAELPNCL 420 

DDLGGFACEC ATGFELGKDG RSCVTSGEGQ PTLGGTGVPT RRPPATATSP VPQRTWPIRV 480

DEKLGETPLV PEQDNSVTSI PEIPRWGSQS TMSTLQMSLQ AESKATITPS GSVISKFNST 540

TSSATPQAFD SSSAVVFIFV STAVVVLVIL TMTVLGLVKL CFHESPSSQP RKESMGPPGL 600
ESDPEPAALG SSSAHCTNNG VKVGDCDLRD RAEGALLAES PLGSSDA

Seq ID NO: 344 DNA sequence
Nucleic Acid Accession #: NM_012072
Coding sequence: 149-2107

1 11 21 31 41 51

| | | | | |

AAAGCCCTCA GCCTTTGTGT CCTTCTCTGC GCCGGAGTGG CTGCAGCTCA CCCCTCAGCT 
CCCCTTGGGG CCCAGCTGGG AGCCGAGATA GAAGCTCCTG TCGCCGCTGG GCTTCTCGCC 
TCCCGCAGAG GGCCACACAG AGACCGGGAT GGCCACCTCC ATGGGCCTGC TGCTGCTGCT 
GCTGCTGCTC CTGACCCAGC CCGGGGCGGG GACGGGAGCT GACAOGGAGG CGGTGGTCTG 
CGTGGGGACC GCCTGCTACA CX3GCCCACTC GGGCAAGCTG AGCGCTGCCG AGGCCCAGAA 
CCACTGCAAC CAGAAOGGGG GCAACCTGGC CACTGTGAAG AGCAAGGAGG AGGCCCAGCA 
CGTCCAGCGA GTACTGGCCC AGCTCCTGAG GCGGGAGGCA GCCCTGAOGG CGAGGATGAG 
CAAGTTCTGG ATTGGGCTCC AGCGAGAGAA GGGCAAGTGC CTGGACCCTA GTCTGCCGCT 
GAAGGGCTTC AGCTGGGTGG GCGGGGGGGA GGACACGCCT TACTCTAACT GGCACAAGGA 
GCTCCGGAAC TCGTGCATCT CCAAGCGCTG TGTGTCTCTG CTGCTGGACC TGTCCCAGCC 
GCTCCTTCCC AACCGCCTGC CCAAGTGGTC TGAGGGCCCC TGTGGGAGCC CAGGCTCCCC 
CGGAAGTAAC ATTGAGGGCT TCGTGTGCAA GTTCAGCTTC AAAGGCATGT GCCGGCCTCT 
GGCCCTGGGG GGCCCAGGTC AGGTGACCTA CACCACCCCC TTCCAGACCA CCAGTTCCTC 
CTTGGAGGCT GTGCCCTTTG CCTCTGCGGC CAATGTAGCC TGTGGGGAAG GTGACAAGGA 
CGAGACTCAG AGTCATTATT TCCTGTGCAA GGAGAAGGCC CCCGATGTGT TCGACTGGGG 
CAGCTCGGGC CCCCTCTGTG TCAGCCCCAA GTATGGCTGC AACTTCAACA ATGGGGGCTG 
CCACCAGGAC TGCTTTGAAG GGGGGGATGG CTCCTTCCTC TGCGGCTGCC GACCAGGATT 
CCGGCTGCTG GATGACCTGG TGACCTGTGC CTCTCGAAAC CCTTGCAGCT CCAGCCCATG 
TCGTGGGGGG GCCACGTGCG TCCTGGGACC CCATGGGAAA AACTACACGT GCCGCTGCCC 
CCAAGGGTAC CAGCTGGACT CGAGTCAGCT GGACTGTGTG GACGTGGATG AATGCCAGGA 
CTCCCCCTGT GCCCAGGAGT GTGTCAACAC CCCTGGGGGC TTCCGCTGCG AATGCTGGGT 
TGGCTATGAG CCGGGCGGTC CTGGAGAGGG GGCCTGTCAG GATGTGGATG AGTGTGCTCT 
GGGTCGCTCG CCTTGCGCCC AGGGCTGCAC CAACACAGAT GGCTCATTTC ACTGCTCCTG 
TGAGGAGGGC TACGTCCTGG CCGGGGAGGA CGGGACTCAG TGCCAGGACG TGGATGAGTG 
TGTGGGCCCG GGGGGCCCCC TCTGCGACAG CTTGTGCTTC AACACACAAG GGTCCTTCCA 
CTGTGGCTGC CTGCCAGGCT GGGTGCTGGC CCCAAATGGG GTCTCTTGCA CCATGGGGCC 
TGTGTCTCTG GGACCACCAT CTGGGCCCCC CGATGAGGAG GACAAAGGAG AGAAAGAAGG 
GAGCACCGTG CCCCGCGCTG CAACAGCCAG TCCCACAAGG GGCCCOGAGG GCACCCCCAA 
GGCTACACCC ACCACAAGTA GACCTTCGCT GTCATCTGAC GCCCCCATCA CATCTGCCCC 
ACTCAAGATG CTGGCCCCCA GTGGGTCCTC AGGCGTCTGG AGGGAGCCCA GCATCCATCA 
CGCCACAGCT GCCTCTGGCC CCCAGGAGCC TGCAGGTGGG GACTCCTCCG TGGCCACACA 
AAACAACGAT GGCACTGACG GGCAAAAGCT GCTTTTATTC TACATCCTAG GCACCGTGGT 
GGCCATCCTA CTCCTGCTGG CCCTGGCTCT GGGGCTACTG GTCTATCGCA AGCGGAGAGC 
GAAGAGGGAG GAGAAGAAGG AGAAGAAGCC CCAGAATGCG GCAGACAGTT ACTCCTGGGT 
TCCAGAGCGA GCTGAGAGCA GGGCCATGGA GAACCAGTAC AGTCCGACAC CTGGGACAGA 
CTGCTGAAAG TGAGGTGGCC CTAGAGACAC TAGAGTCACC AGCCACCATC CTCAGAGCTT 
TGAACTCCCC ATTCCAAAGG GGCACCCACA TTTTTTTGAA AGACTGGACT GGAATCTTAG 
CAAACAATTG TAAGTCTCCT CCTTAAAGGC CCCTTGGAAC ATGCAGGTAT TTTCTACGGG 
TGTTTGATGT TCCTGAAGTG GAAGCTGTGT GTTGGCGTGC CACGGTGGGG ATTTCGTGAC 
TCTATAATGA TTGTTACTCC CCCTCCCTTT TCAAATTCCA ATGTGACCAA TTCCGGATCA 
GGGTGTGAGG AGGCTGGGGC TAAGGGGCTC CCCTGAATAT CTTCTCTGCT CACTTCCACC 
ATCTAAGAGG AAAAGGTGAG TTGCTCATGC TGATTAGGAT TGAAATGATT TGTTTCTCTT 
CCTAGGATGA AAACTAAATC AATTAATTAT TCAATTAGGT AAGAAGATCT GGTTTTTTGG 
TCAAAGGGAA CATGTTCGGA CTGGAAACAT TTCTTTACAT TTGCATTCCT CCATTTCGCC 
AGCACAAGTC TTGCTAAATG TGATACTGTT GACATCCTCC AGAATGGCCA GAAGTGCAAT 
TAACCTCTTA GGTGGCAAGG AGGCAGGAAG TGCCTCTTTA GTTCTTACAT TTCTAATAGC 
CTTGGGTTTA TTTGCAAAGG AAGCTTGAAA AATATGAGAA AAGTTGCTTG AAGTGCATTA 
CAGGTGTTTG TGAAGTCACA TAATCTACGG GGCTAGGGCG AGAGAGGCCA GGGATTTGTT 
CACAGATACT TGAATTAATT CATCCAAATG TACTGAGGTT ACCACACACT TGACTACGGA
TGTGATCAAC ACTAACAAGG AAACAAATTC AAGGACAACC TGTCTTTGAG CCAGGGCAGG 
CCTCAGACAC CCTGCCTGTG GCCCCGCCTC CACTTCATCC TGCCCGGAAT GCCAGTGCTC 
CGAGCTCAGA CAGAGGAAGC CCTGCAGAAA GTTCCATCAG GCTGTTTCCT AAAGGATGTG 
TGAACGGGAG ATGATGCACT GTGTTTTGAA AGTTGTCATT TTAAAGCATT TTAGCACAGT 
TCATAGTCCA CAGTTGATGC AGCATCCTGA GATTTTAAAT CCTGAAGTGT GGGTGGCGCA 
CACACCAAGT AGGGAGCTAG TCAGGCAGTT TGCTTAAGGA ACTTTTGTTC TCTGTCTCTT 
TTCCTTAAAA TTGGGGGTAA GGAGGGAAGG AAGAGGGAAA GAGATGACTA ACTAAAATCA 
TTTTTACAGC AAAAACTGCT CAAAGCCATT TAAATTATAT CCTCATTTTA AAAGTTACAT 
TTGCAAATAT TTCTCCCTAT GATAATGCAG TCGATAGTGT GCACTCTTTC TCTCTCTCTC 
TCTCTCTCAC ACACACACAC ACACACACAC ACACACACAC AGAGACACGG CACCATTCTG 
CCTGGGGCAC TGGAACACAT TCCTGGGGGT CACCGATGGT CAGAGTCACT AGAAGTTACC 
TGAGTATCTC TGGGAGGCCT CATGTCTCCT GTGGGCTTTT TACCACCACT GTGCAGGAGA 
ACAGACAGAG GAAATGTGTC TCCCTCCAAG GCCCCAAAGC CTCAGAGAAA GGGTGTTTCT 
GGTTTTGCCT TAGCAATGCA TCGGTCTCTG AGGTGACACT CTGGAGTGGT TGAAGGGCCA 
CAAGGTGCAG GGTTAATACT CTTGCCAGTT TTGAAATATA GATGCTATGG TTCAGATTGT 
TTTTAATAGA AAACTAAAGQ GGCAGGGGAA GTGAAAGGAA AGATGGAGGT TTTGTGCGGC 
TCGATGGGGC ATTTGGAACT TCTTTTTAAA GTCATCTCAT GGTCTCCAGT TTTCAGTTGG 
AACTCTGGTG TTTAACACTT AAGGGAGACA AAGGCTGTGT CCATTTGGCA AAACTTCCTT 
GGCCACGAGA CTCTAGGTGA TGTGTGAAGC TGGGCAGTCT GTGGTGTGGA GAGCAGCCAT 
CTGTCTGGCC ATTCAGAGGA TTCTAAAGAC ATGGCTGGAT GOGCTGCTGA CCAACATCAG 
CACTTAAATA AATGCAAATG CAACATTTCT CCCTCTGGGC CTTGAAAATC CTTGCCCTTA 
TCATTTGGGG TGAAGGAGAC ATTTCTGTCC TTGGCTTCCC ACAGCCCCAA CGCAGTCTGT 
GTATGATTCC TGGGATCCAA CGAGCCCTCC TATTTTCACA GTGTTCTGAT TGCTCTCACA 
GCCCAGGCCC ATCGTCTGTT CTCTGAATGC AGCCCTGTTC TCAACAACAG GGAGGTCATG 
GAACCCCTCT GTGGAACCCA CAAGGGGAGA AATGGGTGAT AAAGAATCCA GTTCCTCAAA 
ACCTTCCCTG GCAGGCTGGG TCCCTCTCCT GCTGGGTGGT GCTTTCTCTT GCACACCACT 
CCCACCACGG GGGGAGAGCC AGCAACCCAA CCAGACAGCT CAGGTTGTGC ATCTGATGGA 
AACCACTGGG CTCAAACACG TGCTTTATTC TCCTGTTTAT TTTTGCTGTT ACTTTGAAGC 
ATGGAAATTC TTGTTTGGGG GATCTTGGGG CTACAGTAGT GGGTAAACAA ATGCCCACCG 
GCCAAGAGGC CATTAACAAA TCGTCCTTGT CCTGAGGGGC CCCAGCTTGC TCGGGCGTGG 
CACAGTGGGG AATCCAAGGG TCACAGTATG GGGAGAGGTG CACCCTGCCA CCTGCTAACT 
TCTCGCTAGA CACAGTOTTT CTGCCCAGGT GACCTGTTCA GCAGCAGAAC AAGCCAGGGC 4860

CATGGGGACG GGGGAAGTTT TCACTTGGAG ATGGACACCA AGACAATGAA GATTTGTTGT 4920 

CCAAATAGGT CAATAATTCT GGGAGACTCT TGGAAAAAAC TGAATATATT CAGGACCAAC 4980 

TCTCTCCCTC CCCTCATCCC ACATCTCAAA GCAGACAATG TAAAGAGAGA ACATCTCACA 5040 

CACCCAGCTC GCCATGCCTA CTCATTCCTG AATTTCAGGT GCCATCACTG CTCTTTCTTT 5100 

CTTCTTTGTC ATTTGAGAAA GGATGCAGGA GGACAATTCC CACAGATAAT CTGAGGAATG 5160 

CAGAAAAACC AGGGCAGGAC AGTTATCGAC AATGCATTAG AACTTGGTGA GCATCCTCTG 5220 

TAGAGGGACT CCACCCCTGC TCAACAGCTT GGCTTCCAGG CAAGACCAAC CACATCTGGT 5280 

CTCTGCCTTC GGTGGCCCAC ACACCTAAGC GTCATCGTCA TTGCCATAGC ATCATGATGC 5340 

AACACATCTA CGTGTAGCAC TACGACCTTA TGTTTGGGTA ATGTGGGGAT GAACTGCATG 5400 

AGGCTCTGAT TAAGGATGTG GGGAAGTGGG CTGCGGTCAC TGTCGGCCTT GCAAGGCCAC 5460 

CTGGAGGCCT GTCTGTTAGC CAGTGGTGGA GGAGCAAGGC TTCAGGAAGG GCCAGCCACA 5520 

TGCCATCTTC CCTGCGATCA GGCAAAAAAG TGGAATTAAA AAGTCAAACC TTTATATGCA 5580 

TGTGTTATGT CCATTTTGCA GGATGAACTG AGTTTAAAAG AATTTTTTTT TCTCTTCAAG 5640 

TTGCTTTGTC TTTTCCATCC TCATCACAAG CCCTTGTTTG AGTGTCTTAT CCCTGAGCAA 5700 

TCTTTCGATG GATGGAGATG ATCATTAGGT ACTTTTGTTT CAACCTTTAT TCCTGTAAAT 5760 

ATTTCTGTGA AAACTAGGAG AACAGAGATG AGATTTGACA AAAAAAAATT GAATTAAAAA 5820 

TAACACAGTC TTTTTAAAAC TAACATAGGA AAGCCTTTCC TATTATTTCT CTTCTTAGCT 5880 

TCTCCATTGT CTAAATCAGG AAAACAGGAA AACACAGCTT TCTAGCAGCT GCAAAATGGT 5940 

TTAATGCCCC CTACATATTT CCATCACCTT GAACAATAGC TTTAGCTTGG GAATCTGAGA 6000 

TATGATCCCA GAAAACATCT GTCTCTACTT CGGCTGCAAA ACCCATGGTT TAAATCTATA 6060 

TGGTTTGTGC ATTTTCTCAA CTAAAAATAG AGATGATAAT CCGAATTCTC CATATATTCA 6120

CTAATCAAAG ACACTATTTT CATACTAGAT TCCTGAGACA AATACTCACT GAAGGGCTTG 6180 

TTTAAAAATA AATTGTGTTT TGGTCTGTTC TTGTAGATAA TGCCCTTCTA TTTTAGGTAG 6240 

AAGCTCTGGA ATCCCTTTAT TGTGCTGTTG CTCTTATCTG CAAGGTGGCA AGCAGTTCTT 6300 

TTCAGCAGAT TTTGCCCACT ATTCCTCTGA GCTGAAGTTC TTTGCATAGA TTTGGCTTAA 6360 

GCTTGAATTA GATCCCTGCA AAGGCTTGCT CTGTGATGTC AGATGTAATT GTAAATGTCA 6420 

GTAATCACTT CATGAATGCT AAATGAGAAT GTAAGTATTT TTAAATGTGT GTATTTCAAA 6480 

TTTGTTTGAC TAATTCTGGA ATTACAAGAT TTCTATGCAG GATTTACCTT CATCCTGTGC 6540 

ATGTTTCCCA AACTGTGAGG AGGGAAGGCT CAGAGATCGA GCTTCTCCTC TGAGTTCTAA 6600 

CAAAATGGTG CTTTGAGGGT CAGCCTTTAG GAAGGTGCAG CTTTGTTGTC CTTTGAGCTT 6660 
TCTGTTATGT GCCTATCCTA ATAAACTCTT AAACACATT 



Seq ID NO: 345 Protein sequence
Protein Accession #: NP_036204



1 11 21 31 41 51 

I I I I I I 

MATSMGLLLL LLLLLTQPGA GTGADTEAVV CVGTACYTAH SGKLSAAEAQ NHCNQNGGNL 60

ATVKSKEEAQ HVQRVLAQLL RREAALTARM SKFWIGLQRE KGKCLDPSLP LKGFSWVGGG 120

EDTPYSNWHK ELRNSCISKR CVSLLLDLSQ PLLPNRLPKW SEGPCGSPGS PGSNIEGFVC 180

KFSFKGMCRP LALGGPGQVT YTTPFQTTSS SLEAVPFASA ANVACGEGDK DETQSHYFLC 240 

KEKAPDVFDW GSSGPLCVSP KYGCNFNNGG CHQDCFEGGD GSFLCGCRPG FRLLDDLVTC 300

ASRNPCSSSP CRGGATCVLG PHGKNYTCRC PQGYQLDSSQ LDCVDVDECQ DSPCAQECVN 360 

TPGGFRCECW VGYEPGGPGE GACQDVDECA LGRSPCAQGC TNTDGSFHCS CEEGYVLAGE 420 

DGTQCQDVDE CVGPGGPLCD SLCFNTQGSF HCGCLPGWVL APNGVSCTMG PVSLGPPSGP 480 

PDEEDKGEKE GSTVPRAATA SPTRGPEGTP KATPTTSRPS LSSDAPITSA PLKMLAPSGS 540 

SGVWREPSIH HATAASGPQE PAGGDSSVAT QNNDGTDGQK LLLFYILGTV VAILLLLALA 600 
LGLLVYRKRR AKREEKKEKK PQNAADSYSW VPERAESRAM ENQYSPTPGT DC 



Seq ID NO: 346 DNA sequence 
Nucleic Acid Accession #: Z31560 
Coding sequence: <1-966

1 11 21 31 41 51 

I I I I I I 

CACAGCGCCC GCATGTACAA CATGATGGAG ACGGAGCTGA AGCCGCCGGG CCCGCAGCAA 60 

ACTTCGGGGG GCGGCGGCGG CAACTCCACC GCGGCGGCGG CCGGCGGCAA CCAGAAAAAC 120 

AGCCCGGACC GCGTCAAGCG GCCCATGAAT GCCTTCATGG TGTGGTCCCG CGGGCAGCGG 180 

CGCAAGATGG CCCAGGAGAA CCCCAAGATG CACAACTCGG AGATCAGCAA GCGCCTGGGC 240 

GCCGAGTGGA AACTTTTGTC GGAGACGGAG AAGCGGCCGT TCATCGACGA GGCTAAGCGG 300 

CTGCGAGCGC TGCACATGAA GGAGCACCCG GATTATAAAT ACCGGCCCCG GCGGAAAACC 360 

AAGACGCTCA TGAAGAAGGA TAAGTACACG CTGCCCGGCG GGCTGCTGGC CCCCGGCGGC 420 

AATAGCATGG CGAGCGGGGT CGGGGTGGGC GCCGGCCTGG GCGCGGGCGT GAACCAGCGC 480 

ATGGACAGTT ACGCGCACAT GAACGGCTGG AGCAACGGCA GCTACAGCAT GATGCAGGAC 540 

CAGCTGGGCT ACCCGCAGCA CCCGGGCCTC AATGCGCACG GCGCAGCGCA GATGCAGCCC 600 

ATGCACCGCT ACGACGTGAG CGCCCTGCAG TACAACTCCA TGACCAGCTC GCAGACCTAC 660 

ATGAACGGCT CGCCCACCTA CAGCATGTCC TACTCGCAGC AGGGCACCCC TGGCATGGCT 720 

CTTGGCTCCA TGGGTTCGGT GGTCAAGTCC GAGGCCAGCT CCAGCCCCCC TGTGGTTACC 780 

TCTTCCTCCC ACTCCAGGGC GCCCTGCCAG GCCGGGGACC TCCGGGACAT GATCAGCATG 840 

TATCTCCCCG GCGCCGAGGT GCCGGAACCC GCCGCCCCCA GCAGACTTCA CATGTCCCAG 900 

CACTACCAGA GCGGCCCGGT GCCCGGCACG GCCATTAACG GCACACTGCC CCTCTCACAC 960 

ATGTGAGGGC CGGACAGCGA ACTGGAGGGG GGAGAAATTT TCAAAGAAAA ACGAGGGAAA 1020 

TGGGAGGGGT GCAAAAGAGG AGAGTAAGAA ACAGCATGGA GAAAACCCGG TACGCTCAAA 1080 
AAAAA 



Seq ID NO: 347 Protein sequence 
Protein Accession #: CAA83435 



1 11 21 31 41 51 

I I I I I I 

HSARMYNMME TELKPPGPQQ TSGGGGGNST AAAAGGNQKN SPDRVKRPMN AFMVWSRGQR 60 
RKMAQENPKM HNSEISKRLG AEWKLLSETE KRPFIDEAKR LRALHMKEHP DYKYRPRRKT 120 
KTLMKKDKYT LPGGLLAPGG NSMASGVGVG AGLGAGVNQR MDSYAHMNGW SNGSYSMMQD 180 
QLGYPQHPGL NAHGAAQMQP MHRYDVSALQ YNSMTSSQTY MNGSPTYSMS YSQQGTPGMA 240 
LGSMGSVVKS EASSSPPVVT SSSHSRAPCQ AGDLRDMISM YLPGAEVPEP AAPSRLHMSQ 300
HYQSGPVPGT AINGTLPLSH M 
Seq ID NO: 348 DNA sequence
Nucleotide Accession #: NM_002638 
Coding sequence: 120-473 

1 11 21 31 41 51 

I I I I I I 

CAATACAGCT AAGGAATTAT CCCTTGTAAA TACCACAGAC CCGCCCTGGA GCCAGGCCAA 60 

GCT6GACT0C ATAAAGATTG GTATGGCCTT AGCTCTTAGC CAAACACCTT CCTGACACCA 120 

TGAGGGCCAG CAGCTTCTTG ATCGTGGTGG TGTTCCTCAT CGCTGGGACG CTGGTTCTAG 180 

AGGCAGCTGT CACGGGAGTT CCTGTTAAAG GTCAAGACAC TGTCAAAGGC CGTGTTCCAT 240 

TCAATGGACA AGATCCCGTT AAAGGACAAG TTTCAGTTAA AGGTCAAGAT AAAGTCAAAG 300 

CGCAAGAGCC AGTCAAAGGT CCAGTCTCCA CTAAGCCTGG CTCCTGCCCC ATTATCTTGA 360 

TCCGGTGCGC CATGTTGAAT CCCCCTAACC GCTGCTTGAA AGATACTGAC TGCCCAGGAA 420 

TCAAGAAGTG CTGTGAAGGC TCTTGCGGGA TGGCCTGTTT CGTTCCCCAG TGAAGGGAGC 480 

CGGTCCTTGC TGCACCTGTG CCGTCCCCAG AGCTACAGGC CCCATCTGGT CCTAAGTCCC 540 

TGCTGCCCTT CCCCTTCCCA CACTGTCCAT TCTTCCTCCC ATTCAGGATG CCCACGGCTG 600 
GAGCTGCCTC TCTCATCCAC TTTCCAATAA A 

Seq ID NO: 349 Protein sequence: 
Protein Accession #: NP_002629 

1 11 21 31 41 51 

I I I I I I 

MRASSFLIVV VFLIAGTLVL EAAVTGVPVK GQDTVKGRVP FNGQDPVKGQ VSVKGQDKVK 60
AQEPVKGPVS TKPGSCPIIL IRCAMLNPPN RCLKDTDCPG IKKCCEGSCG MACFVPQ 



Seq ID NO: 350 DNA sequence 
Nucleic Acid Accession #: NM_007183
Coding sequence: 75-2468 

1 11 21 31 41 51 

I I I I I I

GAATTCCGGA CAGGACGTGA AGATAGTTGG GTTTGGAGGC GGCCGCCAGG CCCAGGCCCG 60 

GTGGACCTGC CGCCATGCAG GACGGTAACT TCCTGCTGTC GGCCCTGCAG CCTGAGGCCG 120 

GCGTGTGCTC CCTGGCGCTG CCCTCTGACC TGCAGCTGGA CCGCCGGGGC GCCGAGGGGC 180 

CGGAGGCCGA GCGGCTGCGG GCAGCCCGCG TCCAGGAGCA GGTCCGCGCC CGCCTCTTGC 240 

AGCTGGGACA GCAGCCGCGG CACAACGGGG COGCTGAGCC CGAGCCTGAG GCCGAGACTG 300 

CCAGAGGCAC ATCCAGGGGG CAGTACCACA CCCTGCAGGC TGGCTTCAGC TCTCGCTCTC 360 

AGGGCCTGAG TGGGGACAAG ACCTCGGGCT TCCGGCCCAT CGCCAAGCCG GCCTACAGCC 420 

CAGCCTCCTG GTCCTCCCGC TCCGCCGTGG ATCTGAGCTG CAGTCGGAGG CTGAGTTCAG 480 

CCCACAATGG GGGCAGCGCC TTTGGGGCCG CTGGGTACGG GGGTGCCCAG CCCACCCCTC 540 

CCATGCCCAC CAGGCCCGTG TCCTTCCATG AGCGCGGTGG GGTTGGGAGC CGGGCCGACT 600 

ATGACACACT CTCCCTGCGC TCGCTGCGGC TGGGGCCCGG GGGCCTGGAC GACCGCTACA 660 

GCCTGGTGTC TGAGCAGCTG GAGCCCGCGG CCACCTCCAC CTACAGGGCC TTTGCGTACG 720 

AGCGCCAGGC CAGCTCCAGC TCCAGCCGGG CAGGGGGGCT GGACTGGCCC GAGGCCACTG 780 

AGGTTTCCCC GAGCCGGACC ATCCGTGCCC CTGCCGTGCG GACCCTGCAG CGATTCCAGA 840 

GCAGCCACCG GAGCCGCGGG GTAGGCGGGG CAGTGCCGGG GGCCGTCCTG GAGCCAGTGG 900 

CTCGAGCGCC ATCTGTGCGC AGCCTCAGCC TCAGCCTGGC TGACTCGGGC CACCTGCCGG 960 

ACGTGCATGG GTTCAACAGC TACGGTAGCC ACCGAACCCT GCAGAGACTC AGCAGCGGTT 1020 

TTGATGACAT TGACCTGCCC TCAGCAGTCA AGTACCTCAT GGCTTCAGAC CCCAACCTGC 1080 

AGGTGCTGGG AGCGGCCTAC ATCCAGCACA AGTGCTACAG CGATGCAGCC GCCAAGAAGC 1140 

AGGCCCGCAG CCTTCAGGCC GTGCCTAGGC TGGTGAAGCT CTTCAACCAC GCCAACCAGG 1200

AAGTGCAGCG CCATGCCACA GGTGCCATGC GCAACCTCAT CTACGACAAC GCTGACAACA 1260 

AGCTGGCCCT GGTGGAGGAG AACGGGATCT TCGAGCTGCT GCGGACACTG CGGGAGCAGG 1320 

ATGATGAGCT TCGCAAAAAT GTCACAGGGA TCCTGTGGAA CCTTTCATCC AGCGACCACC 1380 

TGAAGGACCG CCTGGCCAGA GACACGCTGG AGCAGCTCAC GGACCTGGTG TTGAGCCCCC 1440 

TGTCGGGGGC TGGGGGTCCC CCCCTCATCC AGCAGAACGC CTCGGAGGCG GAGATCTTCT 1500 

ACAACGCCAC CGGCTTCCTC AGGAACCTCA GCTCAGCCTC TCAGGCCACT CGCCAGAAGA 1560 

TGCGGGAGTG CCACGGGCTG GTGGACGCCC TGGTCACCTC TATCAACCAC GCCCTGGACG 1620 

CGGGCAAATG CGAGGACAAG AGCGTGGAGA ACGCGGTGTG CGTCCTGCGG AACCTGTCCT 1680 

ACCGCCTCTA CGACGAGATG CCGCCGTCCG CGCTGCAGCG GCTGGAGGGT CGCGGCCGCA 1740 

GGGACCTGGC GGGGGCGCCG CCGGGAGAGG TCGTGGGCTG CTTCACGCCG CAGAGCCGGC 1800 

GGCTGCGCGA GCTGCCCCTC GCCGCCGATG CGCTCACCTT CGCGGAGGTG TCCAAGGACC 1860 

CCAAGGGCCT CGAGTGGCTG TGGAGCCCCC AGATCGTGGG GCTGTACAAC CGGCTGCTGC 1920 

AGCGCTGCGA GCTCAACCGG CACACGACGG AGGCGGCCGC CGGGGCGCTG CAGAACATCA 1980 

CGGCAGGCGA CCGCAGGTGG GCGGGGGTGC TGAGCCGCCT GGCCCTGGAG CAGGAGCGTA 2040 

TTCTGAACCC CCTGCTAGAC CGTGTCAGGA CCGCCGACCA CCACCAGCTG CGCTCACTGA 2100 

CTGGCCTCAT CCGAAACCTG TCTCGGAACG CTAGGAACAA GGACGAGATG TCCACGAAGG 2160 

TGGTGAGCCA CCTGATCGAG AAGCTGCCAG GCAGCGTGGG TGAGAAGTCG CCCCCAGCCG 2220 

AGGTGCTGGT CAACATCATA GCTGTGCTCA ACAACCTGGT GGTGGCCAGC CCCATCGCTG 2280 

CCCGAGACCT GCTGTATTTT GACGGACTCC GAAAGCTCAT CTTCATCAAG AAGAAGCGGG 2340 

ACAGCCCCGA CAGTGAGAAG TCCTCCCGGG CAGCATCCAG CCTCCTGGCC AACCTGTGGC 2400 

AGTACAACAA GCTCCACCGT GACTTTCGGG CGAAGGGCTA TCGGAAGGAG GACTTCCTGG 2460 

GCCCATAGGT GAAGCCTTCT GGAGGAGAAG GTGACGTGGC CCAGCGTCCA AGGGACAGAC 2520 

TCAGCTCCAG GCTGCTTGGC AGCCCAGCCT GGAGGAGAAG GCTAATGACG GAGGGGCCCC 2580 

TCGCTGGGGC CCCTGTGTGC ATCTTTGAGG GTCCTGGGCC ACCAGGAGGG GCAGGGTCTT 2640 

ATAGCTGGGG ACTTGGCTTC CGCAGGGCAG GGGGTGGGGC AGGGCTCAAG GCTGCTCTGG 2700 

TGTATGGGGT GGTGACCCAG TCACATTGGC AGAGGTGGGG GTTGGCTGTG GCCTGGCAGT 2760 

ATCTTGGGAT AGCCAGCACT GGGAATAAAG ATGGCCATGA ACAGTCACAA AAAAAAAAAA 2820 

AAAAGGAATT C 

Seq ID NO: 351 Protein sequence 
Protein Accession #: NP_009114.1

1 11 21 31 41 51 
MQDGNFLLSA LQPEAGVCSL ALPSDLQLDR RGAEGPEAER LRAARVQEQV RARLLQLGQQ 60
PRHNGAAEPE PEAETARGTS RGQYHTLQAG FSSRSQGLSG DKTSGFRPIA KPAYSPASWS 120
SRSAVDLSCS RRLSSAHNGG SAFGAAGYGG AQPTPPMPTR PVSFHERGGV GSRADYDTLS 180
LRSLRLGPGG LDDRYSLVSE QLEPAATSTY RAFAYERQAS SSSSRAGGLD WPEATEVSPS 240
RTIRAPAVRT LQRFQSSHRS RGVGGAVPGA VLEPVARAPS VRSLSLSLAD SGHLPDVHGF 300
NSYGSHRTLQ RLSSGFDDID LPSAVKYLMA SDPNLQVLGA AYIQHKCYSD AAAKKQARSL 360
QAVPRLVRLF NHANQEVQRH ATGAMRNLIY DNADNKLALV EENGIFELLR TLREQDDELR 420
KNVTGILWNL SSSDHLKDRL ARDTLEQLTD LVLSPLSGAG GPPLIQQNAS EAEIFYNATG 480
FLRNLSSASQ ATRQKMRECH GLVDALVTSI NHALDAGKCE DKSVENAVCV LRNLSYRLYD 540
EMPPSALQRL EGRGRRDLAG APPGEVVGCF TPQSRRLREL PLAADALTFA EVSKDPKGLE 600
WLWSPQIVGL YNRLLQRCEL NRHTTEAAAG ALQNITAGDR RWAGVLSRLA LEQERILNPL 660
LDRVRTADHH QLRSLTGLIR NLSRNARNKD EMSTKVVSHL IEKLPGSVGE KSPPAEVLVN 720
IIAVLNNLVV ASPIAARDLL YFDGLRKLIF IKKKRDSPDS EKSSRAASSL LANLWQYNKL 780
HRDFRAKGYR KEDFLGP



Seq ID NO: 352 DNA sequence 
Nucleic Acid Accession #: M31469
Coding sequence: 1-651 



1 11 21 31 41 51

I I I I I I

ATGGCTGCGC AGGGAGAGCC CCAGGTCCAG TTCAAACTTG TATTGGTTGG TGATGGTGGT 60
ACTGGAAAAA CGACCTTCGT GAAACGTCAT TTGACTGGTG AATTTGAGAA GAAGTATGTA 120
GCCACCTTGG GTGTTGAGGT TCATCCCCTA GTGTTCCACA CCAACAGAGG ACCTATTAAG 180
TTCAATGTAT GGGACACAGC CGGCCAGGAG AAATTCGGTG GACTGAGAGA TGGCTATTAT 240
ATCCAAGCCC AGTGTGCCAT CATAATGTTT GATGTAACAT CGAGAGTTAC TTACAAGAAT 300
GTGCCTAACT GGCATAGAGA TCTGGTACGA GTGTGTGAAA ACATCCCCAT TGTGTTGTGT 360
GGCAACAAAG TGGATATTAA GGACAGGAAA GTGAAGGCGA AATCCATTGT CTTCCACCGA 420
AAGAAGAATC TTCAGTACTA CGACATTTCT GCCAAAAGTA ACTACAACTT TGAAAAGCCC 480
TTCCTCTGGC TTGCTAGGAA GCTCATTGGA GACCCTAACT TGGAATTTGT TGCCATGCCT 540
GCTCTCGCCC CACCAGAAGT TGTCATGGAC CCAGCTTTGG CAGCACAGTA TGAGCACGAC 600
TTAGAGGTTG CTCAGACAAC TGCTCTCCCG GATGAGGATG ATGACCTGTG A



Seq ID NO: 353 Protein sequence 
Protein Accession #: AAA36546 



1 11 21 31 41 51

MAAQGEPQVQ FKLVLVGDGG TGKTTFVKRH LTGEFEKKYV ATLGVEVHPL VFHTNRGPIK 60
FNVWDTAGQE KFGGLRDGYY IQAQCAIIMF DVTSRVTYKN VPNWHRDLVR VCENIPIVLC 120
GNKVDIKDRK VKAKSIVFHR KKNLQYYDIS AKSNYNFEKP FLWLARKLIG DPNLEFVAMP 180
ALAPPEVVMD PALAAQYEHD LEVAQTTALP DEDDDL*



Seq ID NO: 354 DNA sequence 
Nucleic Acid Accession #: NM_002820
Coding sequence: 304-831 



1 11 21 31 41 51

I I I I I I

CCGGTTCGCA AAGAAGCTGA CTTCAGAGGG GGAAACTTTC TTCTTTTAGG AGGCGGTTAG 60
CCCTGTTCCA CGAACCCAGG AGAACTGCTG GCCAGATTAA TTAGACATTG CTATGGGAGA 120
CGTGTAAACA CACTACTTAT CATTGATGCA TATATAAAAC CATTTTATTT TCGCTATTAT 180
TTCAGAGGAA GCGCCTCTGA TTTGTTTCTT TTTTCCCTTT TTGCTCTTTC TGGCTGTGTG 240
GTTTGGAGAA AGCACAGTTG GAGTAGCCGG TTGCTAAATA AGTCCCGAGC GCGAGCGGAG 300
ACGATGCAGC GGAGACTGGT TCAGCAGTGG AGCGTCGCGG TGTTCCTGCT GAGCTACGCG 360
GTGCCCTCCT GCGGGCGCTC GGTGGAGGGT CTCAGCCGCC GCCTCAAAAG AGCTGTGTCT 420
GAACATCAGC TCCTCCATGA CAAGGGGAAG TCCATCCAAG ATTTACGGCG ACGATTCTTC 480
CTTCACCATC TGATCGCAGA AATCCACACA GCTGAAATCA GAGCTACCTC GGAGGTGTCC 540
CCTAACTCCA AGCCCTCTCC CAACACAAAG AACCACCCCG TCCGATTTGG GTCTGATGAT 600
GAGGGCAGAT ACCTAACTCA GGAAACTAAC AAGGTGGAGA CGTACAAAGA GCAGCCGCTC 660
AAGACACCTG GGAAGAAAAA GAAAGGCAAG CCCGGGAAAC GCAAGGAGCA GGAAAAGAAA 720
AAACGGCGAA CTCGCTCTGC CTGGTTAGAC TCTGGAGTGA CTGGGAGTGG GCTAGAAGGG 780
GACCACCTGT CTGACACCTC CACAACGTCG CTGGAGCTCG ATTCACGGTA ACAGGCTTCT 840
CTGGCCCGTA GCCTCAGCGG GGTGCTCTCA GCTGGGTTTT GGAGCCTCCC TTCTGCCTTG 900
GCTTGGACAA ACCTAGAATT TTCTCCCTTT ATGTATCTCT ATCGATTGTG TAGCAATTGA 960
CAGAGAATAA CTCAGAATAT TGTCTGCCTT AAAGCAGTAC CCCCCTACCA CACACACCCC 1020
TGTCCTCCAG CACCATAGAG AGGCGCTAGA GCCCATTCCT CTTTCTCCAC CGTCACCCAA 1080
CATCAATCCT TTACCACTCT ACCAAATAAT TTCATATTCA AGCTTCAGAA GCTAGTGACC 1140
ATCTTCATAA TTTGCTGGAG AAGTGTATTT CTTCCCCTTA CTCTCACACC TGGGCAAACT 1200
TTCTTCAGTG TTTTTCATTT CTTACGTTCT TTCACTTCAA GGGAGAATAT AGAAGCATTT 1260
GATATTATCT ACAAACACTG CAGAACAGCA TCATGTCATA AACGATTCTG AGCCATTCAC 1320
ACTTTTTATT TAATTAAATG TATTTAATTA AATCTCAAAT TTATTTTAAT GTAAAGAACT 1380
TAAATTATGT TTTAAACACA TGCCTTAAAT TTGTTTAATT AAATTTAACT CTGGTTTCTA 1440
CCAGCTCATA CAAAATAAAT GGTTTCTGAA AATGTTTAAG TATTAACTTA CAAGGATATA 1500
GGTTTTTCTC ATGTATCTTT TTGTTCATTG GCAAGATGAA ATAATTTTTC TAGGGTAATG 1560
CCGTAGGAAA AATAAAACTT CACATTTAAA AAAAA



Seq ID NO: 355 Protein sequence
Protein Accession #: NM_002820

1 11 21 31 41 51

I I I I I I

MQRRLVQQWS VAVFLLSYAV PSCGRSVEGL SRRLKRAVSE HQLLHDKGKS IQDLRRRFFL 60
HHLIAEIHTA EIRATSEVSP NSKPSPNTKN HPVRFGSDDE GRYLTQETNK VETYKEQPLK 120
TPGKKKKGKP GKRKEQEKKK RRTRSAWLDS GVTGSGLEGD HLSDTSTTSL ELDSR

Seq ID NO: 356 DNA sequence 
Nucleic Acid Accession #: NM_017522
Coding sequence: 1-2100 

1 11 21 31 41 51 

I I I I I I 

ATGGGCCTCC CCGAGCCGGG CCCTCTCCGG CTTCTGGCGC TGCTGCTGCT GCTGCTGCTG 60 

CTGCTGCTGC TGCGGCTCCA GCATCTTGCG GCGGCAGCGG CTGATCCGCT GCTCGGCGGC 120 

CAAGGGCCGG CCAAGGAGTG CGAAAAGGAC CAATTCCAGT GCCGGAACGA GCGCTGCATC 180 

CCCTCTGTGT GGAGATGCGA CGAGGAOGAT GACTGCTTAG ACCACAGCGA CGAGGACGAC 240 

TGCCCCAAGA AGACCTGTGC AGACAGTGAC TTCACCTGTG ACAACGGCCA CTGCATCCAC 300 

GAACGGTGGA AGTGTGACGG CGAGGAGGAG TGTCCTGATG GCTCCGATGA GTCCGAGGCC 360 

ACTTGCACCA AGCAGGTGTG TCCTGCAGAG AAGCTGAGCT GTGGACCCAC CAGCCACAAG 420 

TGTGTACCTG CCTCGTGGCG CTGCGACGGG GAGAAGGACT GCGAGGGTGG AGCGGATGAG 480 

GCCGGCTGTG CTACCTCACT GGGCACCTGC CGTGGGGACG AGTTCCAGTG TGGGGATGGG 540 

ACATGTGTCC TTGCAATCAA GCACTGCAAC CAGGAGCAGG ACTGTCCAGA TGGGAGTGAT 600 

GAAGCTGGCT GCCTACAGGG GCTGAACGAG TGTCTGCACA ACAATGGCGG CTGCTCACAC 660 

ATCTGCACTG ACCTCAAGAT TGGCTTTGAA TGCACGTGCC CAGCAGGCTT CCAGCTCCTG 720 

GACCAGAAGA CTTGTGGCGA CATTGATGAG TGCAAGGACC CAGATGCCTG CAGCCAGATC 780 

TGTGTCAATT ACAAGGGCTA TTTTAAGTGT GAGTGCTACC CTGGCTGCGA GATGGACCTA 840 

CTGACCAAGA ACTGCAAGGC TGCTGCTGGC AAGAGCCCAT CCCTAATCTT CACCAACCGC 900 

ACGAGTGCGG AGGATCGACC TGTGAAGCGG AACTATTCAC GCCTCATCCC CATGCTCAAG 960 

AATGTCGTGG CACTAGATGT GGAAGTTGCC ACCAATCGCA TCTACTGGTG TGACCTCTCC 1020 

TACCGTAAGA TCTATAGCGC CTACATGGAC AAGGCCAGTG ACCCGAAAGA GCGGGAGGTC 1080 

CTCATTGACG AGCAGTTGCA CTCTCCAGAG GGCCTGGCAG TGGACTGGGT CCACAAGCAC 1140 

ATCTACTGGA CTGACTCGGG CAATAAGACC ATCTCAGTGG CCACAGTTGA TGGTGGCCGC 1200 

CGACGCACTC TCTTCAGCCG TAACCTCAGT GAACCCCGGG CCATCGCTGT TGACCCCCTG 1260 

CGAGGGTTCA TGTATTGGTC TGACTGGGGG GACCAGGCCA AGATTGAGAA ATCTGGGCTC 1320 

AACGGTGTGG ACCGGCAAAC ACTGGTGTCA GACAATATTG AATGGCCCAA CGGAATCACC 1380 

CTGGATCTGC TGAGCCAGCG CTTGTACTGG GTAGACTCCA AGCTACACCA ACTGTCCAGC 1440 

ATTGACTTCA GTGGAGGCAA CAGAAAGACG CTGATCTCCT CCACTGACTT CCTGAGCCAC 1500 

CCTTTTGGGA TAGCTGTGTT TGAGGACAAG GTGTTCTGGA CAGACCTGGA GAACGAGGCC 1560 

ATTTTCAGTG CAAATCGGCT CAATGGCCTG GAAATCTCCA TCCTGGCTGA GAACCTCAAC 1620 

AACCCACATG ACATTGTCAT CTTCCATGAG CTGAAGCAGC CAAGAGCTCC AGATGCCTGT 1680 

GAGCTGAGTG TCCAGCCTAA TGGAGGCTGT GAATACCTGT GCCTTCCTGC TCCTCAGATC 1740 

TCCAGCCACT CTCCCAAGTA CACATGTGCC TGTCCTGACA CAATGTGGCT GGGTCCAGAC 1800 

ATGAAGAGGT GCTACCGAGA TGCAAATGAA GACAGTAAGA TGGGCTCAAC AGTCACTGCC 1860 

GCTGTTATCG GGATCATCGT GCCCATAGTG GTGATAGCCC TCCTGTGCAT GAGTGGATAC 1920 

CTGATCTGGA GAAACTGGAA GCGGAAGAAC ACCAAAAGCA TGAATTTTGA CAACCCAGTC 1980 

TACAGGAAAA CAACAGAAGA AGAAGATGAA GATGAGCTCC ATATAGGGAG AACTGCTCAG 2040 

ATTGGCCATG TCTATCCTGC ACGAGTGGCA TTAAGCCTTG AAGATGATGG ACTACCCTGA 2100 

GGATGGGATC ACCCCCTTCG TGCCTCATGG AATTCAGTCC CATGCACTAC ACTCCGGATG 2160 

GTGTATGACT GGATGAATGG GTTTCTATAT ATGGGTCTGT GTGAGTGTAT GTGTGTGTGT 2220 

GATTTTTTTT TTTAAATTTA TGTTGCGGAA AGGTAACCAC AAAGTTATGA TGAACTGCAA 2280 

ACATCCAAAG GATGTGAGAG TTTTTCTATG TATAATGTTT TATACACTTT TTAACTGGTT 2340 

GCACTACCCA TGAGGAATTC GTGGAATGGC TACTGCTGAC TAACATGATG CACATAACCA 2400 

AATGGGGGCC AATGGCACAG TACCTTACTC ATCATTTAAA AACTATATTT ACAGAAGATG 2460 

TTTGGTTGCT GGGGGGCTTT TTTAGGTTTT GGGCATTTGT TTTTTGTAAA TAAGATGATT 2520 
ATGCTTTGTG GCTATCCATC AACATAAGT 

Seq ID NO: 357 Protein sequence 
Protein Accession #: NP_059992

1 11 21 31 41 51 

I I I I I I 

MGLPEPGPLR LLALLLLLLL LLLLRLQHLA AAAADPLLGG QGPAKECEKD QFQCRNERCI 60

PSVWRCDEDD DCLDHSDEDD CPKKTCADSD FTCDNGHCIH ERWKCDGEEE CPDGSDESEA 120

TCTKQVCPAE KLSCGPTSHK CVPASWRCDG EKDCEGGADE AGCATSLGTC RGDEFQCGDG 180

TCVLAIKHCN QEQDCPDGSD EAGCLQGLNE CLHNNGGCSH ICTDLKIGFE CTCPAGFQLL 240

DQKTCGDIDE CKDPDACSQI CVNYKGYFKC ECYPGCEMDL LTKNCKAAAG KSPSLIFTNR 300

TSAEDRPVKR NYSRLIPMLK NVVALDVEVA TNRIYWCDLS YRKIYSAYMD KASDPKEREV 360

LIDEQLHSPE GLAVDWVHKH IYWTDSGNKT ISVATVDGGR RRTLFSRNLS EPRAIAVDPL 420

RGFMYWSDWG DQAKIEKSGL NGVDRQTLVS DNIEWPNGIT LDLLSQRLYW VDSKLHQLSS 480

IDFSGGNRKT LISSTDFLSH PFGIAVFEDK VFWTDLENEA IFSANRLNGL EISILAENLN 540

NPHDIVIFHE LKQPRAPDAC ELSVQPNGGC EYLCLPAPQI SSHSPKYTCA CPDTMWLGPD 600

MKRCYRDANE DSKMGSTVTA AVIGIIVPIV VIALLCMSGY LIWRNWKRKN TKSMNFDNPV 660
YRKTTEEEDE DELHIGRTAQ IGHVYPARVA LSLEDDGLP

Seq ID NO: 358 DNA sequence 
Nucleic Acid Accession ft: M27826 
Coding sequence: <1-503

1 11 21 31 41 51 

I I I I I I 

AGCCCAAGAA ACATCTCACC AATTTCAAAT CTGATCTATT CGGCTTAGCG ACTGAAGATT 60 

GACGCTGCCC GATCGCCTCG GAAGTCCCCT GGACCATCAC AGAAGCCGAG CTTCGGGTAA 120 

CTCTCACAGT GGAGGGTAAG TCCATCCCCT GTTTAATCGA TACGGGGGCT ACCCACTCCA 180

CGTTGCCTTC TTTTCAAGGG CCTGTTTCCC TTGCCCCCAT AACTGTTGTG GGTATTGACG 240 

GCCAAGCTTC AAAACCCCTG AAAACTCCCC CACTCTGGTG CCAACTTGGA CAACACTCTT 300 

TTATGCACTC TTTTTTAGTT ATCCCCACCT GCCCACTTCC CTTATTAGGC CGAAATATTT 360 

TAACCAAATT ATCTGCTTCC CTGACTATTC CTGGAGTACA GCTACATCTC ATTGCTGCCC 420 

TTCTTCCCAA TCCAAAGCCT CCTTTGTGTC CTCTAACATC CCCACAATAT CAGCCCTTAC 480 

CACAAGACCT CCCTTCAGCT TAATCTCTCC CACTCTAGGT TCCCACGCCG CCCCTAATCC 540 

CACTTGAAGC AGCCCTGAGA AACATCGCCC ATTCTCTCTC CATACCACCC CCCAAAAATT 600 

TTCGCCGCTC CAACACTTCA ACACTATTTT GTTTTATTTG TCTTATTAAT ATCAGAAGGC 660 



AGGAATGTCA GGCCTCTGAG CCCAGGCCAG GCCATCGCAT CCCCTGTGAC TTGCACGTAT 720

ACATCCAGAT GGCCTGAAGT AACTGAAGAT CCACAAAAGA AGTAAAAACA GCCTTAACTG 780 

ATGACATTCC ACCATTGTGA TTTQTTCCTG CCCCACCCTA ACTGATCAAT GTACTTTGTA 840 

ATCTCCCCCA CCCTTAAGAA GGTTCTTTGT AATTCTCCCC ACCCTTGAGA ATGTACTTTG 900 

TGAGATCCAC CCCTGCCCAC CAGAGAACAA CCCCCTTTGA TTGTAATTTT TTATTACCTT 960 

CCCAAATCCT ATAAAACAGC CCCACCCCTA TCTTCCTTCA CTGACTCTCT TTTCGGACTC 1020 
AGCCACCGGC ACCCAGGTGA AATAAACAGC TTTATTGCTC AC 

Seq ID NO: 359 Protein sequence 
Protein Accession #: AAA65999

1 11 21 31 41 51 

I I I I I I 

PKKHLTNFKS DLFGLATEDW RCPIASEVPW TITEAELRVT LTVEGKSIPC LIDTGATHST 60 

LPSFQGPVSL APITVVGIDG QASKPLKTPP LWCQLGQHSF MHSFLVIPTC PLPLLGRNIL 120
TKLSASLTIP GVQLHLIAAL LPNPKPPLCP LTSPQYQPLP QDLPSA 

Seq ID NO: 360 DNA sequence 
Nucleic Acid Accession #: NM_001854
Coding sequence: 162-5582 

1 11 21 31 41 51 

I I I I I I 

AACCATCAAA TTTAGAAGAA AAAGCCCTTT GACTTTTTCC CCCTCTCCCT CCCCAATGGC 60 

TGTGTAGCAA ACATCCCTGG CGATACCTTG GAAAGGACOA AGTTGGTCTG CAGTCGCAAT 120 

TTCGTGGGTT GAGTTCACAG TTGTGAGTGC GGGGCTCGGA GATGGAGCCG TGGTCCTCTA 180 

GGTGGAAAAC GAAACGGTGG CTCTGGGATT TCACCGTAAC AACCCTCGCA TTGACCTTCC 240 

TCTTCCAAGC TAGAGAGGTC AGAGGAGCTG CTCCAGTTGA TGTACTAAAA GCACTAGATT 300 

TTCACAATTC TCCAGAGGGA ATATCAAAAA CAACGGGATT TTGCACAAAC AGAAAGAATT 360 

CTAAAGGCTC AGATACTGCT TACAGAGTTT CAAAGCAAGC ACAACTCAGT GCCCCAACAA 420 

AACAGTTATT TCCAGGTGGA ACTTTCCCAG AAGACTTTTC AATACTATTT ACAGTAAAAC 480 

CAAAAAAAGG AATTCAGTCT TTCCTTTTAT CTATATATAA TGAGCATGGT ATTCAGCAAA 540 

TTGGTGTTGA GGTTGGGAGA TCACCTGTTT TTCTGTTTGA AGACCACACT GGAAAACCTG 600 

CCCCAGAAGA CTATCCCCTC TTCAGAACTG TTAACATCGC TGACGGGAAG TGGCATCGGG 660 

TAGCAATCAG CGTGGAGAAG AAAACTGTGA CAATGATTGT TGATTGTAAG AAGAAAACCA 720 

CGAAACCACT TGATAGAAGT GAGAGAGCAA TTGTTGATAC CAATGGAATC ACGGTTTTTG 780 

GAACAAGGAT TTTGGATGAA GAAGTTTTTG AGGGGGACAT TCAGCAGTTT TTGATCACAG 840 

GTGATCCCAA GGCAGCATAT GACTACTGTG AGCATTATAG TCCAGACTGT GACTCTTCAG 900 

CACCCAAGGC TGCTCAAGCT CAGGAACCTC AGATAGATGA GTATGCACCA GAGGATATAA 960

TCGAATATGA CTATGAGTAT GGGGAAGCAG AGTATAAAGA GGCTGAAAGT GTAACAGAGG 1020 

GACCCACTGT AACTGAGGAG ACAATAGCAC AGACGGAGGC AAACATCGTT GATGATTTTC 1080 

AAGAATACAA CTATGGAACA ATGGAAAGTT ACCAGACAGA AGCTCCTAGG CATGTTTCTG 1140 

GGACAAATGA GCCAAATCCA GTTGAAGAAA TATTTACTGA AGAATATCTA ACGGGAGAGG 1200 

ATTATGATTC CCAGAGGAAA AATTCTGAGG ATACACTATA TGAAAACAAA GAAATAGACG 1260 

GCAGGGATTC TGATCTTCTG GTAGATGGAG ATTTAGGCGA ATATGATTTT TATGAATATA 1320 

AAGAATATGA AGATAAACCA ACAAGCCCCC CTAATGAAGA ATTTGGTCCA GGTGTACCAG 1380 

CAGAAACTGA TATTACAGAA ACAAGCATAA ATGGCCATGG TGCATATGGA GAGAAAGGAC 1440 

AGAAAGGAGA ACCAGCAGTG GTTGAGCCTG GTATGCTTGT CGAAGGACCA CCAGGACCAG 1500 

CAGGACCTGC AGGTATTATG GGTCCTCCAG GTCTACAAGG CCCCACTGGA CCCCCTGGTG 1560 

ACCCTGGCGA TAGGGGCCCC CCAGGACGTC CTGGCTTACC AGGGGCTGAT GGTCTACCTG 1620 

GTCCTCCTGG TACTATGTTG ATGTTACCGT TCCGTTATGG TGGTGATGGT TCCAAAGGAC 1680 

CAACCATCTC TGCTCAGGAA GCTCAGGCTC AAGCTATTCT TCAGCAGGCT CGGATTGCTC 1740 

TGAGAGGCCC ACCTGGCCCA ATGGGTCTAA CTGGAAGACC AGGTCCTGTG GGGGGGCCTG 1800 

GTTCATCTGG GGCCAAAGGT GAGAGTGGTG ATCCAGGTCC TCAGGGCCCT CGAGGCGTCC 1860 

AGGGTCCCCC TGGTCCAACG GGAAAACCTG GAAAAAGGGG TCGTCCAGGT GCAGATGGAG 1920 

GAAGAGGAAT GCCAGGAGAA CCTGGGGCAA AGGGAGATCG AGGGTTTGAT GGACTTCCGG 1980 

GTCTGCCAGG TGACAAAGGT CACAGGGGTG AACGAGGTCC TCAAGGTCCT CCAGGTCCTC 2040 

CTGGTGATGA TGGAATGAGG GGAGAAGATG GAGAAATTGG ACCAAGAGGT CTTCCAGGTG 2100 

AAGCTGGCCC ACGAGGTTTG CTGGGTCCAA GGGGAACTCC AGGAGCTCCA GGGCAGCCTG 2160 

GTATGGCAGG TGTAGATGGC CCCCCAGGAC CAAAAGGGAA CATGGGTCCC CAAGGGGAGC 2220 

CTGGGCCTCC AGGTCAACAA GGGAATCCAG GACCTCAGGG TCTTCCTGGT CCACAAGGTC 2280 

CAATTGGTCC TCCTGGTGAA AAAGGACCAC AAGGAAAACC AGGACTTGCT GGACTTCCTG 2340 

GTGCTGATGG GCCTCCTGGT CATCCTGGGA AAGAAGGCCA GTCTGGAGAA AAGGGGGCTC 2400 

TGGGTCCCCC TGGTCCACAA GGTCCTATTG GATNNCCGGG CCCCCGGGGA GTAAAGGGAG 2460 

CAGATGGTGT CAGAGGTCTC AAGGGATCTA AAGGTGAAAA GGGTGAAGAT GGTTTTCCAG 2520 

GATTCAAAGG TGACATGGGT CTAAAAGGTG ACAGAGGAGA AGTTGGTCAA ATTGGCCCAA 2580 

GAGGGNAAGA TGGCCCTGAA GGACCCAAAG GTCGAGCAGG CCCAACTGGA GACCCAGGTC 2640 

CTTCAGGTCA AGCAGGAGAA AAGGGAAAAC TTGGAGTTCC AGGATTACCA GGATATCCAG 2700 

GAAGACAAGG TCCAAAGGGT TCCACTGGAT TCCCTGGGTT TCCAGGTGCC AATGGAGAGA 2760 

AAGGTGCACG GGGAGTAGCT GGCAAACCAG GCCCTCGGGG TCAGCGTGGT CCAACGGGTC 2820 

CTCGAGGTTC AAGAGGTGCA AGAGGTCCCA CTGGGAAACC TGGGCCAAAG GGCACTTCAG 2880 

GTGGCGATGG CCCTCCTGGC CCTCCAGGTG AAAGAGGTCC TCAAGGACCT CAGGGTCCAG 2940 

TTGGATTCCC TGGACCAAAA GGCCCTCCTG GACCACCAGG AAGGATGGGC TGCCCAGGAC 3000 

ACCCTGGGCA ACGTGGGGAG ACTGGATTTC AAGGCAAGAC CGGCCCTCCT GGGCCAGGGG 3060 

GAGTGGTTGG ACCACAGGGA CCAACCGGTG AGACTGGTCC AATAGGGGAA CGTGGGTATC 3120 

CTGGTCCTCC TGGCCCTCCT GGTGAGCAAG GTCTTCCTGG TGCTGCAGGA AAAGAAGGTG 3180 

CAAAGGGTGA TCCAGGTCCT CAAGGTATCT CAGGGAAAGA TGGACCAGCA GGATTACGTG 3240 

GTTTCCCAGG GGAAAGAGGT CTTCCTGGAG CTCAGGGTGC ACCTGGACTG AAAGGAGGGG 3300 

AAGGTCCCCA GGGCCCACCA GGTCCAGTTG GCTCACCAGG AGAACGTGGG TCAGCAGGTA 3360 

CAGCTGGCCC AATTGGTTTA CGAGGGCGCC CGGGACCTCA GGGTCCTCCT GGTCCAGCTG 3420 

GAGAGAAAGG TGCTCCTGGA GAAAAAGGTC CCCAAGGGCC TGCAGGGAGA GATGGAGTTC 3480 

AAGGTCCTGT TGGTCTCCCA GGGCCAGCTG GTCCTGCCGG CTCCCCTGGG GAAGACGGAQ 3540 

ACAAGGGTGA AATTGGTGAG CCGGGACAAA AAGGCAGCAA GGGTGGCAAG GGAGAAAATG 3600 

GCCCTCCCGG TCCCCCAGGT CTTCAAGGAC CAGTTGGTGC CCCTGGAATT GCTGGAGGTG 3660 

ATGGTGAACC AGGTCCTAGA GGACAGCAGG GGATGTTTGG GCAAAAAGGT GATGAGGGTG 3720 

CCAGAGGCTT COCTGGACCT CCTGGTCCAA TAGGTCTTCA GGGTCTGCCA GGCCCACCTG 3780 

GTGAAAAAGG TGAAAATGGG GATGTTGGTC CATGGGGGCC ACCTGGTCCT CCAGGCCCAA 3840 
GAGGCCCTCA AGGTCCCAAT GOAGCTGATO GACCACAAGG ACCCCCAGGT TCTGTTGGTT 3900

CAGTTGGTGG TGTTGGAGAA AAGGGTGAAC CTGGAGAAGC AGGAAACCCA GGGCCTCCTG 3960

GGGAAGCAGG TGTAGGCGGT CCCAAAGGAG AAAGAGGAGA GAAAGGGGAA GCTGGTCCAC 4020 

CTGGAGCTGC TGGACCTCCA GGTGCCAAGG GGCCGCCAGG TGATGATGGC CCTAAGGGTA 4080 

ACCCGGGTCC TGTTGGTTTT CCTGGAGATC CTGGTCCTCC TGGGGAACTT GGCCCTGCAG 4140 

GTCAAGATGG TGTTGGTGGT GACAAGGGTG AAGATGGAGA TCCTGGTCAA CCGGGTCCTC 4200 

CTGGCCCATC TGGTGAGGCT GGCCCACCAG GTCCTCCTGG AAAACGAGGT CCTCCTGGAG 4260 

CTGCAGGTGC AGAGGGAAGA CAAGGTGAAA AAGGTGCTAA GGGGGAAGCA GGTGCAGAAG 4320 

GTCCTCCTGG AAAAACCGGC CCAGTCGGTC CTCAGGGACC TGCAGGAAAG CCTGGTCCAG 4380 

AAGGTCTTCG GGGCATCCCT GGTCCTGTGG GAGAACAAGG TCTCCCTGGA GCTGCAGGCC 4440 

AAGATGGACC ACCTGGTCCT ATGGGACCTC CTGGCTTACC TGGTCTCAAA GGTGACCCTG 4500 

GCTCCAAGGG TGAAAAGGGA CATCCTGGTT TAATTGGCCT GATTGGTCCT CCAGGAGAAC 4560 

AAGGGGAAAA AGGTGACCGA GGGCTCCCTG GAACTCAAGG ATCTCCAGGA GCAAAAGGGG 4620 

ATGGGGGAAT TCCTGGTCCT GCTGGTCCCT TAGGTCCACC TGGTCCTCCA GGCTTACCAG 4680 

GTCCTCAAGG CCCAAAGGGT AACAAAGGCT CTACTGGACC CGCTGGCCAG AAAGGTGACA 4740 

GTGGTCTTCC AGGGCCTCCT GGGCCTCCAG GTCCACCTGG TGAAGTCATT CAGCCTTTAC 4800 

CAATCTTGTC CTCCAAAAAA ACGAGAAGAC ATACTGAAGG CATGCAAGCA GATGCAGATG 4860 

ATAATATTCT TGATTACTCG GATGGAATGG AAGAAATATT TGGTTCCCTC AATTCCCTGA 4920 

AACAAGACAT CGAGCATATG AAATTTCCAA TGGGTACTCA GACCAATCCA GCCCGAACTT 4980 

GTAAAGACCT GCAACTCAGC CATCCTGACT TCCCAGATGG TGAATATTGG ATTGATCCTA 5040 

ACCAAGGTTG CTCAGGAGAT TCCTTCAAAG TTTACTGTAA TTTCACATCT GGTGGTGAGA 5100 

CTTGCATTTA TCCAGACAAA AAATCTGAGG GAGTAAGAAT TTCATCATGG CCAAAGGAGA 5160 

AACCAGGAAG TTGGTTTAGT GAATTTAAGA GGGGAAAACT GCTTTCATAC TTAGATGTTG 5220 

AAGGAAATTC CATCAATATG GTGCAAATGA CATTCCTGAA ACTTCTGACT GCCTCTGCTC 5280 

GGCAAAATTT CACCTACCAC TGTCATCAGT CAGCAGCCTG GTATGATGTG TCATCAGGAA 5340 

GTTATGACAA AGCACTTCGC TTCCTGGGAT CAAATGATGA GGAGATGTCC TATGACAATA 5400 

ATCCTTTTAT CAAAACACTG TATGATGGTT GTACGTCCAG AAAAGGCTAT GAAAAAACTG 5460 

TCATTGAAAT CAATACACCA AAAATTGATC AAGTACCTAT TGTTGATGTC ATGATCAGTG 5520 

ACTTTGGTGA TCAGAATCAG AAGTTCGGAT TTGAAGTTGG TCCTGTTTGT TTTCTTGGCT 5580 

AAGATTAAGA CAAAGAACAT ATCAAATCAA CAGAAAATGT ACCTTGGTGC CACCAACCCA 5640 

TTTTGTGCCA CATGCAAGTT TTGAATAAGG ATGTATGGAA AACAACGCTG CATATACAGG 5700 

TACCATTTAG GAAATACCGA TGCCTTTGTG GGGGCAGAAT CACAGACAAA AGCTTTGAAA 5760 

ATCATAAAGA TATAAGTTGG TGTGGCTAAG ATGGAAACAG GGCTGATTCT TGATTCCCAA 5820 

TTCTCAACTC TCCTTTTCCT ATTTGAATTT CTTTGGTGCT GTAGAAAACA AAAAAAGAAA 5880 

AATATATATT CATAAAAAAT ATGGTGCTCA TTCTCATCCA TCCAGGATGT ACTAAAACAG 5940 

TGTGTTTAAT AAATTGTAAT TATTTTGTGT ACAGTTCTAT ACTGTTATCT GTGTCCATTT 6000 

CCAAAACTTG CACGTGTCCC TGAATTCCGC TGACTCTAAT TTATGAGGAT GCCGAACTCT 6060 

GATGGCAATA ATATATGTAT TATGAAAATG AAGTTATGAT TTCCGATGAC CCTAAGTCCC 6120 
TTTCTTTGGT TAATGATGAA ATTCCTTTGT GTGTGTTT 

Seq ID NO: 361 Protein sequence 
Protein Accession #: NP_001845 

1 11 21 31 41 51 

MEPWSSRWKT KRWLWDFTVT TLALTFLFQA REVRGAAPVD VLKALDFHNS PEGISKTTGF 60 

CTNRKNSKGS DTAYRVSKQA GLSAPTKQLF PGGTFPEDFS ILFTVKPKKG IQSFLLSIYN 120 

EHGIQQIGVE VGRSPVFLFE DHTGKPAPED YPLFRTVNIA DGKWHRVAIS VEKKTVTMIV 180 

DCKKKTTKPL DRSERAIVDT NGITVFGTRI LDEEVFEGDI QQFLITGDPK AAYDYCEHYS 240 

PDCDSSAPKA AQAQEPQIDE YAPEDIIEYD YEYGEAEYKE AESVTEGPTV TEETIAQTEA 300 

NIVDDFQEYN YGTMESYQTE APRHVSGTNE PNPVEEIFTE EYLTGEDYDS QRKNSEDTLY 360 

ENKEIDGRDS DLLVDGDLGE YDFYEYKEYE DKPTSPPNEE FGPGVPAETD ITETSINGHG 420 

AYGEKGQKGE PAVVEPGMLV EGPPGPAGPA GIMGPPGLQG PTGPPGDPGD RGPPGRPGLP 480 

GADGLPGPPG TMLMLPFRYG GDGSKGPTIS AQEAQAQAIL QQARIALRGP PGPMGLTGRP 540 

GPVGGPGSSG AKGESGDPGP QGPRGVQGPP GPTGKPGKRG RPGADGGRGM PGEPGAKGDR 600 

GFDGLPGLPG DKGHRGERGP QGPPGPPGDD GMRGEDGEIG PRGLPGEAGP RGLLGPRGTP 660 

GAPGQPGMAG VDGPPGPKGN MGPQGEPGPP GQQGNPGPQG LPGPQGPIGP PGEKGPQGKP 720 

GLAGLPGADG PPGHPGKEGQ SGEKGALGPP GPQGPIGXPG PRGVKGADGV RGLKGSKGEK 780 

GEDGFPGFKG DMGLKGDRGE VGQIGPRGXD GPEGPKGRAG PTGDPGPSGQ AGEKGKLGVP 840 

GLPGYPGRQG PKGSTGFPGF PGANGEKGAR GVAGKPGPRG GRGPTGPRGS RGARGPTGKP 900 

GPKGTSGGDG PPGPPGERGP QGPQGPVGFP GPKGPPGPPG RMGCPGHPGQ RGETGFQGKT 960 

GPPGPGGVVG PQGPTGETGP IGERGYPGPP GPPGEQGLPG AAGKEGAKGD PGPQGISGKD 1020 

GPAGLRGFPG ERGLPGAQGA PGLKGGEGPQ GPPGPVGSPG ERGSAGTAGP IGLRGRPGPQ 1080 

GPPGPAGEKG APGEKGPQGP AGRDGVQGPV GLPGPAGPAG SPGEDGDKGE IGEPGQKGSK 1140 

GGKGENGPPG PPGLQGPVGA PGIAGGDGEP GPRGQQGMFG QKGDEGARGF PGPPGPIGLQ 1200 

GLPGPPGEKG ENGDVGPWGP PGPPGPRGPQ GPNGADGPQG PPGSVGSVGG VGEKGEPGEA 1260 

GNPGPPGEAG VGGPKGERGE KGEAGPPGAA GPPGAKGPPG DDGPKGNPGP VGFPGDPGPP 1320 

GELGPAGQDG VGGDKGEDGD PGQPGPPGPS GEAGPPGPPG KRGPPGAAGA EGRQGEKGAK 1380 

GEAGAEGPPG KTGPVGPQGP AGKPGPEGLR GIPGPVGEQG LPGAAGQDGP PGPMGPPGLP 1440 

GLKGDPGSKG EKGHPGLIGL IGPPGEQGEK GDRGLPGTQG SPGAKGDGGI PGPAGPLGPP 1500 

GPPGLPGPQG PKGNKGSTGP AGQKGDSGLP GPPGPPGPPG EVIQPLPILS SKKTRRHTEG 1560 

MQADADDNIL DYSDGMEEIF GSLNSLKQDI EHMKFPMGTQ TNPARTCKDL QLSHPDFPDG 1620 

EYWIDPNQGC SGDSFKVYCN FTSGGETCIY PDKKSEGVRI SSWPKEKPGS WFSEFKRGKL 1680 

LSYLDVEGNS INMVQMTFLK LLTASARQNF TYHCHQSAAW YDVSSGSYDK ALRFLGSNDE 1740 

EMSYDNNPFI KTLYDGCTSR KGYEKTVIEI NTPKIDQVPI VDVMISDFGD QNQKFGFEVG 1800 
PVCFLG 



Seq ID NO: 362 DNA sequence 
Nucleic Acid Accession #: NM_003107 
Coding sequence: 351-1775 

1 11 21 31 41 51 

| | | | | | 

TTCCCCAGCA TTCGAGAAAC TCCTCTCTAC TTTAGCACGG TCTCCAGACT CAGCCGAGAG 60 
ACAGCAAACT GCAGCGCGGT GAGAGAGCGA GAGAGAGGGA GAGAGAGACT CTCCAGCCTG 120 
GGAACTATAA CTCCTCTGCG AGAGGCGGAG AACTCCTTCC CCAAATCTTT TGGGGACTTT 180 



TCTCTCTTTA CCCACCTCOG CCCCTGCGAG GAGTTGAGGG GCCAGTTCGG CCGCCGOGCG 240 

OGTCTTCCOG TTCGGCGTGT GCTTGGCCCG GGGAACOGGG AGGGCCCGGC GATCGCX5CGG 300 

CGGCCGCCGC GAGGGTGTGA G0GCGC6TGG GCGCCOGCCG AGCCGAGGCC ATGGTGCAGC 360 

AAACCAACAA TGCCGAGAAC ACGGAAGCGC TGCTGGCCGG OGAGAGCTCG GACTCGGGCG 420 

CCGGCCTCGA GCTGGGAATC GCCTCCTCCC CCACGCCCGG CTCCACCGCC TCCACGGGCG 480 

GCAAGGCCGA CGACCCGAGC TGGTGCAAGA CCCCGAGTGG GCACATCAAG CGACCCATGA 540 

ACGCCTTCAT GGTGTGGTCG CAGATCGAGC GGCGCAAGAT CATGGAGCAG TCGCCCGACA 600 

TGCACAACGC CGAGATCTCC AAGCGGCTGG GCAAACGCTG GAAGCTGCTC AAAGACAGCG 660 

ACAAGATCCC TTTCATTCGA GAGGCGGAGC GGCTGCGCCT CAAGCACATG GCTGACTACC 720 

CCGACTACAA GTACCGGCCC AGGAAGAAGG TGAAGTCCGG CAACGCCAAC TCCAGCTCCT 780 

CGGCCGCCGC CTCCTCCAAG CCGGGGGAGA AGGGAGACAA GGTCGGTGGC AGTGGCGGGG 840 

GCGGCCATGG GGGCGGCGGC GGCGGCGGGA GCAGCAACGC GGGGGGAGGA GGCGGCGGTG 900 

CGAGTGGCGG CGGCGCCAAC TCCAAACCGG CGCAGAAAAA GAGCTGCGGC TCCAAAGTGG 960 

CGGGCGGCGC GGGCGGTGGG GTTAGCAAAC CGCACGCCAA GCTCATCCTG GCAGGCGGCG 1020 

GCGGCGGCGG GAAAGCAGCG GCTGCCGCCG CCGCCTCCTT CGCCGCCGAA CAGGCGGGGG 1080 

CCGCCGCCCT GCTGCCCCTG GGCGCCGCCG CCGACCACCA CTCGCTGTAC AAGGCGCGGA 1140 

CTCCCAGCGC CTCGGCCTCC GCCTCCTCGG CAGCCTCGGC CTCCGCAGCG CTCGCGGCCC 1200 

CGGGCAAGCA CCTGGCGGAG AAGAAGGTGA AGCGCGTCTA CCTGTTCGGC GGCCTGGGCA 1260 

CGTCGTCGTC GCCCGTGGGC GGCGTGGGCG CGGGAGCCGA CCCCAGCGAC CCCCTGGGCC 1320 

TGTACGAGGA GGAGGGCGCG GGCTGCTCGC CCGACGCGCC CAGCCTGAGC GGCCGCAGCA 1380 

GCGCCGCCTC GTCCCCCGCC GCCGGCCGCT CGCCCGCCGA CCACCGCGGC TACGCCAGCC 1440 

TGCGCGCCGC CTCGCCCGCC CCGTCCAGCG CGCCCTCGCA CGCGTCCTCC TCGGCCTCGT 1500 

CCCACTCCTC CTCTTCCTCC TCCTCGGGCT CCTCGTCCTC CGACGACGAG TTCGAAGACG 1560 

ACCTGCTCGA CCTGAACCCC AGCTCAAACT TTGAGAGCAT GTCOCTGGGC AGCTTCAGTT 1620 

CGTCGTCGGC GCTCGACCGG GACCTGGATT TTAACTTCGA GCCCGGCTCC GGCTCGCACT 1680 

TCGAGTTCCC GGACTACTGC ACGCCCGAGG TGAGCGAGAT GATCTCGGGA GACTGGCTCG 1740 

AGTCCAGCAT CTCCAACCTG GTTTTCACCT ACTGAAGGGC GCGCAGGCAG GGAGAAGGGC 1800 

CGGGGGGGGT AGGAGAGGAG AAAAAAAAAG TGAAAAAAAG AAACGAAAAG GACAGACGAA 1860 

GAGTTTAAAG AGAAAAGGGA AAAAAGAAAG AAAAAGTAAG CAGGGCTCGT TCGCCCGCGT 1920 

TCTCGTCGTC GGATCAAGGA GCGCGGCGGC GTTTTGGACC CGCGCTCCCA TCCCCCACCT 1980 

TCCCGGGCCG GGGACCCACT CTGCCCAGCC GGAGGGACGC GGAGGAGGAA GAGGGTAGAC 2040 

AGGGGCGACC TGTGATTGTT GTTATTGATG TTGTTGTTGA TGGCAAAAAA AAAAAGCGAC 2100 

TTCGAGTTTG CTCCCCTTTG CTTGAAGAGA CCCCCTCCCC CTTCCAACGA GCTTCCGGAC 2160 

TTGTCTGCAC CCCCAGCAAG AAGGCGAGTT AGTTTTCTAG AGACTTGAAG GAGTCTCCCC 2220 

CTTCCTGCAT CACCACCTTG GTTTTGTTTT ATTTTGCTTC TTGGTCAAGA AAGGAGGGGA 2280 

GAACCCAGCG CACCCCTCCC CCCCTTTTTT TAAACGCGTG ATGAAGACAG AAGGCTCCGG 2340 

GGTGACGAAT TTGGCCGATG GCAGATGTTT TGGGGGAACG CCGGGACTGA GAGACTCCAC 2400 

GCAGGCGAAT TCCCGTTTGG GGCCTTTTTT TCCTCCCTCT TTTCCCCTTG CCCCCTCTGC 2460 

AGCCGGAGGA GGAGATGTTG AGGGGAGGAG GCCAGCCAGT GTGACCGGCG CTAGGAAATG 2520 

ACCCGAGAAC CCCGTTGGAA GCGCAGCAGC GGGAGCTAGG GGCGGGGGCG GAGGAGGACA 2580 

CGAACTGGAA GGGGGTTCAC GGTCAAACTG AAATGGATTT GCACGTTGGG GAGCTGGCGG 2640 

CGGCGGCTGC TGGGCCTCCG CCTTCTTTTC TACGTGAAAT CAGTGAGGTG AGACTTCCCA 2700 

GACCCCGGAG GCGTGGAGGA GAGGAGACTG TTTGATGTGG TACAGGGGCA GTCAGTGGAG 2760 
GGCGAGTGGT TTCGGAAAAA AAAAAAGAAA AAAAGGG 

Seq ID NO: 363 Protein sequence 
Protein Accession #: NP_003098 

1 11 21 31 41 51 

| | | | | | 

MVQQTNNAEN TEALLAGESS DSGAGLELGI ASSPTPGSTA STGGKADDPS WCKTPSGHIK 60 

RPMNAFMVWS QIERRKIMEQ SPDMHNAEIS KRLGKRWKLL KDSDKIPFIR EAERLRLKHM 120 

ADYPDYKYRP RKKVKSGNAN SSSSAAASSK PGEKGDKVGG SGGGGHGGGG GGGSSNAGGG 180 

GGGASGGGAN SKPAQKKSCG SKVAGGAGGG VSKPHAKLIL AGGGGGGKAA AAAAASFAAE 240 

QAGAAALLPL GAAADHHSLY KARTPSASAS ASSAASASAA LAAPGKHLAE KKVKRVYLFG 300 

GLGTSSSPVG GVGAGADPSD PLGLYEEEGA GCSPDAPSLS GRSSAASSPA AGRSPADHRG 360 

YASLRAASPA PSSAPSHASS SASSHSSSSS SSGSSSSDDE FEDDLLDLNP SSNFESMSLG 420 
SFSSSSALDR DLDFNFEPGS GSHFEFPDYC TPEVSEMISG DWLESSISNL VFTY 



Seq ID NO: 364 DNA sequence 
Nucleic Acid Accession #: U10860 
Coding sequence: 123-2204 

1 11 21 31 41 51 

| | | | | | 

TGCCGGCTGC TCCTCGACCA GGCCTCCTTC TCAACCTCAG CCCGCGGCGC CGACCCTTCC 60 

GGCACCCTCC CGCCCCGTCT CGTACTGTCG CCGTCACCGC CGCGGCTCCG GCCCTGGCCC 120 

CGATGGCTCT GTGCAACGGA GACTCCAAGC TGGAGAATGC TGGAGGAGAC CTTAAGGATG 180 

GCCACCACCA CTATGAAGGA GCTGTTGTCA TTCTGGATGC TGGTGCTCAG TACGGGAAAG 240 

TCATAGACCG AAGAGTGAGG GAACTGTTCG TGCAGTCTGA AATTTTCCCC TTGGAAACAC 300 

CAGCATTTGC TATAAAGGAA CAAGGATTCC GTGCTATTAT CATCTCTGGA GGACCTAATT 360 

CTGTGTATGC TGAAGATGCT CCCTGGTTTG ATCCAGCAAT ATTCACTATT GGCAAGCCTG 420 

TTCTTGGAAT TTGCTATGGT ATGCAGATGA TGAATAAGGT ATTTGGAGGT ACTGTGCACA 480 

AAAAAAGTGT CAGAGAAGAT GGAGTTTTCA ACATTAGTGT GGATAATACA TGTTCATTAT 540 

TCAGGGGCCT TCAGAAGGAA GAAGTTGTTT TGCTTACACA TGGAGATAGT GTAGACAAAG 600 

TAGCTGATGG ATTCAAGGTT GTGGCACGTT CTGGAAACAT AGTAGCAGGC ATAGCAAATG 660 

AATCTAAAAA GTTATATGGA GCACAGTTCC ACCCTGAAGT TGGCCTTACA GAAAATGGAA 720 

AAGTAATACT GAAGAATTTC CTTTATGATA TAGCTGGATG CAGTGGAACC TTCACCGTGC 780 

AGAACAGAGA ACTTGAGTGT ATTCGAGAGA TCAAAGAGAG AGTAGGCACG TCAAAAGTTT 840 

TGGTTTTACT CAGTGGTGGA GTAGACTCAA CAGTTTGTAC AGCTTTGCTA AATCGTGCTT 900 

TGAACCAAGA ACAAGTCATT GCTGTGCACA TTGATAATGG CTTTATGAGA AAACGAGAAA 960 

GCCAGTCTGT TGAAGAGGCC CTCAAAAAGC TTGGAATTCA GGTCAAAGTG ATAAATGCTG 1020 

CTCATTCTTT CTACAATGGA ACAACAACCC TACCAATATC AGATGAAGAT AGAACCCCAC 1080 

GGAAAAGAAT TAGCAAAACG TTAAATATGA CCACAAGTCC TGAAGAGAAA AGAAAAATCA 1140 

TTGGGGATAC TTTTGTTAAG ATTGCCAATG AAGTAATTGG AGAAATGAAC TTGAAACCAG 1200 

AGGAGGTTTT CCTTGCCCAA GGTACTTTAC GGCCTGATCT AATTGAAAGT GCATCCCTTG 1260 

TTGCAAGTGG CAAAGCTGAA CTCATCAAAA CCCATCACAA TGACACAGAG CTCATCAGAA 1320 

AGTTGAGAGA GGAGGGAAAA GTAATAGAAC CTCTGAAAGA TTTTCATAAA GATGAAGTGA 1380 

GAATTTTGGG CAGAGAACTT GGACTTCCAG AAGAGTTAGT TTCCAGGCAT CCATTTCCAG 1440 

GTCCTGGCCT GGCAATCAGA GTAATATGTG CTGAAGAACC TTATATTTGT AAGGACTTTC 1500 

CTGAAACCAA CAATATTTTG AAAATAGTAG CTGATTTTTC TGCAAGTGTT AAAAAGCCAC 1560 

ATACCCTATT ACAGAGAGTC AAAGCCTGCA CAACAGAAGA GGATCAGGAG AAGCTGATGC 1620 

AAATTACCAG TCTGCATTCA CTGAATGCCT TCTTGCTGCC AATTAAAACT GTAGGTGTGC 1680 

AGGGTGACTG TCGTTCCTAC AGTTACGTGT GTGGAATCTC CAGTAAAGAT GAACCTGACT 1740 

GGGAATCACT TATTTTTCTG GCTAGGCTTA TACCTCGCAT GTGTCACAAC GTTAACAGAG 1800 

TTGTTTATAT ATTTGGCCCA CCAGTTAAAG AACCTCCTAC AGATGTTACT CCCACTTTCT 1860 

TGACAACAGG GGTGCTCAGT ACTTTACGCC AAGCTGATTT TGAGGCCCAT AACATTCTCA 1920 

GGGAGTCTGG GTATGCTGGG AAAATCAGCC AGATGCCGGT GATTTTGACA CCATTACATT 1980 

TTGATCGGGA CCCACTTCAA AAGCAGCCTT CATGCCAGAG ATCTGTGGTT ATTCGAACCT 2040 

TTATTACTAG TGACTTCATG ACTGGTATAC CTGCAACACC TGGCAATGAG ATCCCTGTAG 2100 

AGGTGGTATT AAAGATGGTC ACTGAGATTA AGAAGATTCC TGGTATTTCT CGAATTATGT 2160 
ATGACTTAAC ATCAAAGCCC CCAGGAACTA CTGAGTGGGA GTAATAAACT TC 






Seq ID NO: 365 Protein sequence 
Protein Accession #: AAA60331 



1 11 21 31 41 51 
| | | | | | 
MALCNGDSKL ENAGGDLKDG HHHYEGAVVI LDAGAQYGKV IDRRVRELFV QSEIFPLETP 60 
AFAIKEQGFR AIIISGGPNS VYAEDAPWFD PAIFTIGKPV LGICYGMQMM NKVFGGTVHK 120 
KSVREDGVFN ISVDNTCSLF RGLQKEEVVL LTHGDSVDKV ADGFKVVARS GNIVAGIANE 180 
SKKLYGAQFH PEVGLTENGK VILKNFLYDI AGCSGTFTVQ NRELECIREI KERVGTSKVL 240 
VLLSGGVDST VCTALLNRAL NQEQVIAVHI DNGFMRKRES QSVEEALKKL GIQVKVINAA 300 
HSFYNGTTTL PISDEDRTPR KRISKTLNMT TSPEEKRKII GDTFVKIANE VIGEMNLKPE 360 
EVFLAQGTLR PDLIESASLV ASGKAELIKT HHNDTELIRK LREEGKVIEP LKDFHKDEVR 420 
ILGRELGLPE ELVSRHPFPG PGLAIRVICA EEPYICKDFP ETNNILKIVA DFSASVKKPH 480 
TLLQRVKACT TEEDQEKLMQ ITSLHSLNAF LLPIKTVGVQ GDCRSYSYVC GISSKDEPDW 540 
ESLIFLARLI PRMCHNVNRV VYIFGPPVKE PPTDVTPTFL TTGVLSTLRQ ADFEAHNILR 600 
ESGYAGKISQ MPVILTPLHF DRDPLQKQPS CQRSVVIRTF ITSDFMTGIP ATPGNEIPVE 660 
VVLKMVTEIK KIPGISRIMY DLTSKPPGTT EWE 



Seq ID NO: 366 DNA sequence 
Nucleic Acid Accession #: NM_004219 
Coding sequence: 46-654 

1 11 21 31 41 51 
| | | | | | 
GCGGCCTCAG ATGAATGCGG CTGTTAAGAC CTGCAATAAT CCAGAATGGC TACTCTGATC 60 
TATGTTGATA AGGAAAATGG AGAACCAGGC ACCCGTGTGG TTGCTAAGGA TGGGCTGAAG 120 
CTGGGGTCTG GACCTTCAAT CAAAGCCTTA GATGGGAGAT CTCAAGTTTC AACACCACGT 180 
TTTGGCAAAA CGTTCGATGC CCCACCAGCC TTACCTAAAG CTACTAGAAA GGCTTTGGGA 240 
ACTGTCAACA GAGCTACAGA AAAGTCTGTA AAGACCAAGG GACCCCTCAA ACAAAAACAG 300 
CCAAGCTTTT CTGCCAAAAA GATGACTGAG AAGACTGTTA AAGCAAAAAG CTCTGTTCCT 360 
GCCTCAGATG ATGCCTATCC AGAAATAGAA AAATTCTTTC CCTTCAATCC TCTAGACTTT 420 
GAGAGTTTTG ACCTGCCTGA AGAGCACCAG ATTGCGCACC TCCCCTTGAG TGGAGTGCCT 480 
CTCATGATCC TTGACGAGGA GAGAGAGCTT GAAAAGCTGT TTCAGCTGGG CCCCCCTTCA 540 
CCTGTGAAGA TGCCCTCTCC ACCATGGGAA TCCAATCTGT TGCAGTCTCC TTCAAGCATT 600 
CTGTCGACCC TGGATGTTGA ATTGCCACCT GTTTGCTGTG ACATAGATAT TTAAATTTCT 660 
TAGTGCTTCA GAGTTTGTGT GTATTTGTAT TAATAAAGCA TTCTTCAACA GAAAAAAAAA 720 
AAAAAAAA 



Seq ID NO: 367 Protein sequence 
Protein Accession #: NP_004210 

1 11 21 31 41 51 
| | | | | | 
MATLIYVDKE NGEPGTRVVA KDGLKLGSGP SIKALDGRSQ VSTPRFGKTF DAPPALPKAT 60 
RKALGTVNRA TEKSVKTKGP LKQKQPSFSA KKMTEKTVKA KSSVPASDDA YPEIEKFFPF 120 
NPLDFESFDL PEEHQIAHLP LSGVPLMILD EERELEKLFQ LGPPSPVKMP SPPWESNLLQ 180 
SPSSILSTLD VELPPVCCDI DI 



Seq ID NO: 368 DNA sequence 
Nucleic Acid Accession #: NM_000597 
Coding sequence: 118-1104 

1 11 21 31 41 51 
| | | | | | 
ATTCGGGGCG AGGGAGGAGG AAGAAGCGGA GGAGGCGGCT CCCGCTCGCA GGGCCGTGCA 60 
CCTGCCCGCC CGCCCGCTCG CTCGCTCGCC OGCCGCGCCG CGCTGCCGAC CGCCAGCATG 120 
CTGCCGAGAG TGGGCTGCCC CGCGCTGCCG CTGCCGCCGC CGCCGCTGCT GCCGCTGCTG 180 
CCGCTGCTGC TGCTGCTACT GGGCGCGAGT GGCGGCGGCG GCGGGGCGCG CGCGGAGGTG 240 
CTGTTCCGCT GCCCGCCCTG CACACCCGAG CGCCTGGCCG CCTGCGGGCC CCCGCCGGTT 300 
GCGCCGCCCG CCGCGGTGGC CGCAGTGGCC GGAGGCGCCC GCATGCCATG CGCGGAGCTC 360 
GTCCGGGAGC CGGGCTGCGG CTGCTGCTCG GTGTGCGCCC GGCTGGAGGG CGAGGCGTGC 420 
GGCGTCTACA CCCCGCGCTG CGGCCAGGGG CTGCGCTGCT ATCCCCACCC GGGCTCCGAG 480 
CTGCCCCTGC AGGCGCTGGT CATGGGCGAG GGCACTTGTG AGAAGCGCCG GGACGCCGAG 540 
TATGGCGCCA GCCCGGAGCA GGTTGCAGAC AATGGCGATG ACCACTCAGA AGGAGGCCTG 600 
GTGGAGAACC ACGTGGACAG CACCATGAAC ATGTTGGGCG GGGGAGGCAG TGCTGGCCGG 660 
AAGCCCCTCA AGTCGGGTAT GAAGGAGCTG GCCGTGTTCC GGGAGAAGGT CACTGAGCAG 720 
CACCGGCAGA TGGGCAAGGG TGGCAAGCAT CACCTTGGCC TGGAGGAGCC CAAGAAGCTG 780 
CGACCACCCC CTGCCAGGAC TCCCTGCCAA CAGGAACTGG ACCAGGTCCT GGAGCGGATC 840 
TCCACCATGC GCCTTCCGGA TGAGCGGGGC CCTCTGGAGC ACCTCTACTC CCTGCACATC 900 
CCCAACTGTG ACAAGCATGG CCTGTACAAC CTCAAACAGT GCAAGATGTC TCTGAACGGG 960 
CAGCGTGGGG AGTGCTGGTG TGTGAACCCC AACACCGGGA AGCTGATCCA GGGAGCCCCC 1020 
ACCATCCGGG GGGACCCCGA GTGTCATCTC TTCTACAATG AGCAGCAGGA GGCTTGCGGG 1080 
GTGCACACCC AGCGGATGCA GTAGACCGCA GCCAGCCGGT GCCTGGCGCC CCTGCCCCCC 1140 
GCCCCTCTCC AAACACCGGC AGAAAACGGA GAGTGCTTGG GTGGTGGGTG CTGGAGGATT 1200 
TTCCAGTTCT GACACACGTA TTTATATTTG GAAAGAGACC AGCACCGAGC TCGGCACCTC 1260 
CCCGGCCTCT CTCTTCCCAG CTGCAGATGC CACACCTGCT CCTTCTTGCT TTCCCCGGGG 1320 
GAGGAAGGGG GTTGTGGTCG GGGAGCTGGG GTACAGGTTT GGGGAGGGGG AAGAGAAATT 1380 
TTTATTTTTG AACCCCTGTG TCCCTTTTGC ATAAGATTAA AGGAAGGAAA AGT 

Seq ID NO: 369 Protein sequence 
Protein Accession #: NP_000588 

1 11 21 31 41 51 
| | | | | | 
MLPRVGCPAL PLPPPPLLPL LPLLLLLLGA SGGGGGARAE VLFRCPPCTP ERLAACGPPP 60 
VAPPAAVAAV AGGARMPCAE LVREPGCGCC SVCARLEGEA CGVYTPRCGQ GLRCYPHPGS 120 
ELPLQALVMG EGTCEKRRDA EYGASPEQVA DNGDDHSEGG LVENHVDSTM NMLGGGGSAG 180 
RKPLKSGMKE LAVFREKVTE QHRQMGKGGK HHLGLEEPKK LRPPPARTPC QQELDQVLER 240 
ISTMRLPDER GPLEHLYSLH IPNCDKHGLY NLKQCKMSLN GQRGECWCVN PNTGKLIQGA 300 
PTIRGDPECH LFYNEQQEAC GVHTQRMQ 



Seq ID NO: 370 DNA sequence 
Nucleic Acid Accession #: NM_004264 
Coding sequence: 6-440 

1 11 21 31 41 51 
| | | | | | 
GGAACATGGC GGATCGGCTC ACGCAGCTTC AGGACGCTGT GAATTCGCTT GCAGATCAGT 60 
TTTGTAATGC CATTGGAGTA TTGCAGCAAT GTGGTCCTCC TGCCTCTTTC AATAATATTC 120 
AGACAGCAAT TAACAAAGAC CAGCCAGCTA ACCCTACAGA AGAGTATGCC CAGCTTTTTG 180 
CAGCACTGAT TGCACGAACA GCAAAAGACA TTGATGTTTT GATAGATTCC TTACCCAGTG 240 
AAGAATCTAC AGCTGCTTTA CAGGCTGCTA GCTTGTATAA GCTAGAAGAA GAAAACCATG 300 
AAGCTGCTAC ATGTGTGGAG GATGTTGTTT ATCGAGGAGA CATGCTTCTG GAGAAGATAC 360 
AAAGCGCACT TGCTGATATT GCACAGTCAC AGCTGAAGAC AAGAAGTGGT ACCCATAGCC 420 
AGTCTCTTCC AGACTCATAG CATCAGTGGA TACCATGTGG CTGAGAAAAG AACTGTTTGA 480 
GTGCCATTAA GAATTCTGCA TCAGACTTAG ATACAAGCCT TACCAACAAT TACAGAAACA 540 
TTAAACACTA TGACACATTA CCTTTTTAGC TATTTTTAAT AGTCTTCTAT TTTCACTCTT 600 
GATAAGCTTA TAAATCATGA TTGAATCAGC TTTAAAGCAT CATACCATCA TTTTTTAACT 660 
GAGTGAAATT ATTAAGGCAT GTAATACATT AATGAACATA ATATAAGGAA ACATATGTAA 720 
AATTCTGTTA TGACATAATT TATGTCTCCA TTTTGTTGTA TTGGCCAGTA CTTTTACAAT 780 
C 



Seq ID NO: 371 Protein sequence 
Protein Accession #: NP_004255 

1 11 21 31 41 51 
| | | | | | 
MADRLTQLQD AVNSLADQFC NAIGVLQQCG PPASFNNIQT AINKDQPANP TEEYAQLFAA 60 
LIARTAKDID VLIDSLPSEE STAALQAASL YKLEEENHEA ATCVEDVVYR GDMLLEKIQS 120 
ALADIAQSQL KTRSGTHSQS LPDS 

Seq ID NO: 372 DNA sequence 
Nucleic Acid Accession #: AJ271091 
Coding sequence: 1-1113 



1 11 21 31 41 51 
| | | | | | 
ATGGAGAATC AGGTGTTGAC GCCGCATGTC TACTGGGCTC AGCGACACCG CGAGCTATAT 60 
CTGCGCGTGG AGCTGAGTGA CGTACAGAAC CCTGCCATCA GCATCACTGA AAACGTGCTG 120 
CATTTCAAAG CTCAAGGACA TGGTGCCAAA GGAGACAATG TCTATGAATT TCACCTGGAG 180 
TTCTTAGACC TTGTGAAACC AGAGCCTGTT TACAAACTGA CCCAGAGGCA GGTAAACATT 240 
ACAGTACAGA AGAAAGTGAG TCAGTGGTGG GAGAGACTCA CAAAGCAGGA AAAGCGACCA 300 
CTGTTTTTGG CTCCTGACTT TGATCGTTGG CTGGATGAAT CTGATGCGGA AATGGAGCTC 360 
AGAGCTAAGG AAGAAGAGCG CCTAAATAAA CTCCGACTGG AAAGCGAAGG CTCTCCTGAA 420 
ACTCTTACAA ACTTAAGGAA AGGATACCTG TTTATGTATA ATCTTGTGCA ATTCTTGGGA 480 
TTCTCCTGGA TCTTTGTCAA CCTGACTGTG CGATTCTGTA TCTTGGGAAA AGAGTCCTTT 540 
TATGACACAT TCCATACTGT GGCTGACATG ATGTATTTCT GCCAGATGCT GGCAGTTGTG 600 
GAAACTATCA ATGCAGCAAT TGGAGTCACT ACGTCACCGG TGCTGCCTTC TCTGATCCAG 660 
CTTCTTGGAA GAAATTTTAT TTTGTTTATC ATCTTTGGCA CCATGGAAGA AATGCAGAAC 720 
AAAGCTGTGG TTTTCTTTGT GTTTTATTTG TGGAGTGCAA TTGAAATTTT CAGGTACTCT 780 
TTCTACATGC TGACGTGCAT TGACATGGAT TGGAAGGTGC TCACATGGCT TCGTTACACT 840 
CTGTGGATTC CCTTATATCC ACTGGGATGT TTGGCGGAAG CTGTCTCAGT GATTCAGTCC 900 
ATTCCAATAT TCAATGAGAC CGGACGATTC AGTTTCACAT TGCCATATCC AGTGAAAATC 960 
AAAGTTAGAT TTTCCTTTTT TCTTCAGATT TATCTTATAA TGATATTTTT AGGTTTATAC 1020 
ATAAATTTTC GTCACCTTTA TAAACAGCGC AGACTGAAAA TGAGGGCAGG CGCAGTGGCT 1080 
CATGCCTGTG ATCCCAGCGC TTTGGGAGGC TGA 



Seq ID NO: 373 Protein sequence 
Protein Accession #: CAB69070 



1 11 21 31 41 51 
| | | | | | 
MENQVLTPHV YWAQRHRELY LRVELSDVQN PAISITENVL HFKAQGHGAK GDNVYEFHLE 60 
FLDLVKPEPV YKLTQRQVNI TVQKKVSQWW ERLTKQEKRP LFLAPDFDRW LDESDAEMEL 120 
RAKEEERLNK LRLESEGSPE TLTNLRKGYL FMYNLVQFLG FSWIFVNLTV RFCILGKESF 180 
YDTFHTVADM MYFCQMLAVV ETINAAIGVT TSPVLPSLIQ LLGRNFILFI IFGTMEEMQN 240 
KAVVFFVFYL WSAIEIFRYS FYMLTCIDMD WKVLTWLRYT LWIPLYPLGC LAEAVSVIQS 300 
IPIFNETGRF SFTLPYPVKI KVRFSFFLQI YLIMIFLGLY INFRHLYKQR RLKMRAGAVA 360 
HACDPSALGG 



Seq ID NO: 374 DNA sequence 
Nucleic Acid Accession #: NM_016395 
Coding sequence: 1-1113 

1 11 21 31 41 51 

I I I I I I 

ATGGAGAATC AGGTGTTGAC GCCGCATGTC TACTGGGCTC AGCGACACCG CGAGCTATAT 60 

CTGCGCGTGG AGCTGAGTGA CGTACAGAAC CCTGCCATCA GCATCACTGA AAACGTGCTG 120 

CATTTCAAAG CTCAAGGACA TGGTGCCAAA GGAGACAATG TCTATGAATT TCACCTGGAG 180 

TTCTTAGACC TTGTGAAACC AGAGCCTGTT TACAAACTGA CCCAGAGGCA GGTAAACATT 240 

ACAGTACAGA AGAAAGTGAG TCAGTGGTGG GAGAGACTCA CAAAGCAGGA AAAGCGACCA 300 

CTGTTTTTGG CTCCTGACTT TGATCGTTGG CTGGATGAAT CTGATGCGGA AATGGAGCTC 360 

AGAGCTAAGG AAGAAGAGCG CCTAAATAAA CTCCGACTGG AAAGCGAAGG CTCTCCTGAA 420 

ACTCTTACAA ACTTAAGGAA AGGATACCTG TTTATGTATA ATCTTGTGCA ATTCTTGGGA 480 

TTCTCCTGGA TCTTTGTCAA CCTGACTGTG CGATTCTGTA TCTTGGGAAA AGAGTCCTTT 540 

TATGACACAT TCCATACTGT GGCTGACATG ATGTATTTCT GCCAGATGCT GGCAGTTGTG 600 

GAAACTATCA ATGCAGCAAT TGGAGTCACT ACGTCACCGG TGCTGCCTTC TCTGATCCAG 660 

CTTCTTGGAA GAAATTTTAT TTTGTTTATC ATCTTTGGCA CCATGGAAGA AATGCAGAAC 720 

AAAGCTGTGG TTTTCTTTGT GTTTTATTTG TGGAGTGCAA TTGAAATTTT CAGGTACTCT 780 

TTCTACATGC TGACGTGCAT TGACATGGAT TGGAAGGTGC TCACATGGCT TCGTTACACT 840 

CTGTGGATTC CCTTATATCC ACTGGGATGT TTGGCGGAAG CTGTCTCAGT GATTCAGTCC 900 

ATTCCAATAT TCAATGAGAC CGGACGATTC AGTTTCACAT TGCCATATCC AGTGAAAATC 960 

AAAGTTAGAT TTTCCTTTTT TCTTCAGATT TATCTTATAA TGATATTTTT AGGTTTATAC 1020 

ATAAATTTTC GTCACCTTTA TAAACAGCGC AGACTGAAAA TGAGGGCAGG CGCAGTGGCT 1080 
CATGCCTGTG ATCCCAGCGC TTTGGGAGGC TGA 



Seq ID NO: 375 Protein sequence 
Protein Accession #: NP_057479 

1 11 21 31 41 51 

| | I I I I 

MENQVLTPHV YWAQRHRELY LRVELSDVQN PAISITENVL HFKAQGHGAK GDNVYEFHLE 60 
FLDLVKPEPV YKLTQRQVNI TVQKKVSQWW ERLTKQEKRP LFLAPDFDRW LDESDAEMEL 120 
RAKEEERLNK LRLESEGSPE TLTNLRKGYL FMYNLVQFLG FSWIFVNLTV RFCILGKESF 180 
YDTFHTVADM MYFCQMLAVV ETINAAIGVT TSPVLPSLIQ LLGRNFILFI IFGTMEEMQN 240 
KAVVFFVFYL WSAIEIFRYS FYMLTCIDMD WKVLTWLRYT LWIPLYPLGC LVEAVSVIQS 300 
IPIFNETGRF SFTLPYPVKI KVRFSFFLQI YLIMIFLGLY INFRHLYKQR RRRYGKKRKR 360 
STKKKDLDGF LPV 

Seq ID NO: 376 DNA sequence 
Nucleic Acid Accession #: NM_005987 
Coding sequence: 1-270 

1 11 21 31 41 51 

| | I I I I 

ATGAATTCTC AGCAGCAGAA GCAGCCTTGC ACCCCACCCC CTCAGCCTCA GCAGCAGCAG 60 
GTGAAACAAC CTTGCCAGCC TCCACCCCAG GAACCATGCA TCCCCAAAAC CAAGGAGCCC 120 
TGCCAACCCA AGGTGCCTGA GCCCTGCCAC CCCAAAGTGC CTGAGCCCTG CCAGCCCAAG 180 
ATTCCAGAGC CCTGCCAGCC CAAGGTGCCT GAGCCCTGCC CTTCAACGGT CACTCCAGCA 240 
CCAGCCCAGC AGAAGACCAA GCAGAAGTAA 

Seq ID NO: 377 Protein sequence 
Protein Accession #: NP_005978 

1 11 21 31 41 51 

I I I I I I 

MNSQQQKQPC TPPPQPQQQQ VKQPCQPPPQ EPCIPKTKEP CQPKVPEPCH PKVPEPCQPK 60 
IPEPCQPKVP EPCPSTVTPA PAQQKTKQK 



Seq ID NO: 378 DNA sequence 
Nucleic Acid Accession #: NM_002105 
Coding sequence: 74-505 

1 11 21 31 41 51 

I I I I I I 

ACAGCAGTTA CACTGCGGCG GGCGTCTGTT CTAGTGTTTG AGCCGTCGTG CTTCACCGGT 60 

CTACCTCGCT AGCATGTCGG GCCGCGGCAA GACTGGCGGC AAGGCCCGCG CCAAGGCCAA 120 

GTCGCGCTCG TCGCGCGCCG GCCTCCAGTT CCCAGTGGGC CGTGTACACC GGCTGCTGCG 180 

GAAGGGCCAC TACGCCGAGC GCGTTGGOGC CGGCGCGCCA GTGTACCTGG CGGCAGTGCT 240 

GGAGTACCTC ACCGCTGAGA TCCTGGAGCT GGCGGGCAAT GCGGCCCGCG ACAACAAGAA 300 

GACGCGAATC ATCCCCCGCC ACCTGCAGCT GGCCATCCGC AACGACGAGG AGCTCAACAA 360 

GCTGCTGGGC GGCGTGACGA TCGCCCAGGG AGGCGTCCTG CCCAACATCC AGGCCGTGCT 420 

GCTGCCCAAG AAGACCAGCG CCACCGTGGG GCCGAAGGCG CCCTCGGGCG GCAAGAAGGC 480 

CACCCAGGCC TCCCAGGAGT ACTAAGAGGG CCCGCGCCGC GGCCGGCCGC CCCAGCTCCC 540 

CATGCCACCA CAAAGGCCCT TTTAAGGGCC ACCACCGCCC TCATGGAAAG AGCTGAGCCG 600 

CTTCAGACTG CGGGGCAAGC GGGCCGCGGC TCCCTTCCCC TCCCCTCCCC TCGCCCGCCT 660 

TCGCCGCCCG GCCTCGAGTC CCCGCCCGCC CCCGCTCCCG TCCCGCACCG CCTGCCGCGT 720 

CGGCCTCGGG CCTGCCCTGT CCGCCGTCCG CCCTCCGGTA GGGTTCGGGC CTTCCGGATG 780 

CGGCTTGGGC GCTCTTCGGG GACCTCCGTG GCGCGGAAGA CCCGAGCCTG CCGGGGGGAG 840 




GCCGGCGGCG CCGCACCTGC CCGCCTCGGC GTTCGTGACT CAGCCGCCCC ATCCCGAGTC 900 

GCTAAGGGGC TGCGGGGAGG CCGCAGCACC TTCTGGAAGA CTTGGCCTTC CGCTCTGACG 960 

CAGGGCCGAG GTGGGCAGTC CAGGCCGAGA GCCGGCGGCC CTGAAGGTGA GTGAGGCCCT 1020 

CGGCAGCTGC AGCCGGGGTG TCTGGTACCC CCCCGGCGTG GTGCTTAGCC CAGGACTTTC 1080 

AGACGGCCGC TGGCCGGGAG GCTTTGGTGG GAGAGACGCG ATCGCCGATT TCGGTCTGGC 1140 

GCCCCTTCTG CGGCCGGGAC CCAGGCCTTT CACATCAGCT CTCCCTCCAT CTTCATTCAT 1200 

AGGTCTGCGC TGGGGCCGGG ACGAAGCACT TGGTAACAGG CACATCTTCC TCCCGAGTGA 1260 

CTGCCTCCTA GGAGGACATT TAGGGGAGGG CAGAGGCCTG CAGTTTGGCT TCACGGCTGG 1320 

CTATGTGGAC AGCAAGAGTC GTTTTGCGGA ACGCGACTGG CAGCCAGGCC TGTCGGGCCC 1380 

CCGACGCCGC CCCATTTCCC TTCCAGCAAA CTCAACTCGG CAATCCAAGC ACCTAGATAC 1440 

CAGCACAAGT CGGTTAATCC CTGTCTGGAC TGAGCCTCCG TTGGCTTCTG AACTGGAATT 1500 

CTGCAGCTAA CCCTTCCACG ACTAGAACCT TAGGCATTGG GGAGTTTTAG ATGGACTAAT 1560 
TTTATTAAAG GATTGTTTTT TTTTT 

Seq ID NO: 379 Protein sequence 
Protein Accession #: NP_002096 

1 11 21 31 41 51 

I I I I I I 

MSGRGKTGGK ARAKAKSRSS RAGLQFPVGR VHRLLRKGHY AERVGAGAPV YLAAVLEYLT 60 

AEILELAGNA ARDNKKTRII PRHLQLAIRN DEELNKLLGG VTIAQGGVLP NIQAVLLPKK 120 
TSATVGPKAP SGGKKATQAS QEY 



Seq ID NO: 380 DNA sequence 
Nucleic Acid Accession #: AL136942 
Coding sequence: 184-864 

1 11 21 31 41 51 

I I I I I I 

ACGCGTCCGG CAGAAGCTCG GAGCTCTCGG GGTATCGAGG AGGCAGGCCC GCGGGCGCAC 60 

GGGCGAGCGG GCCGGGAGCC GGAGCGGCGG AGGAGCCGGC AGCAGCGGCG CGGCGGGCTC 120 

CAGGCGAGGC GGTCGACGCT CCTGAAAACT TGCGCGCGCG CTCGCGCCAC TGCGCCCGGA 180 

GCGATGAAGA TGGTCGCGCC CTGGACGCGG TTCTACTCCA ACAGCTGCTG CTTGTGCTGC 240 

CATGTCCGCA CCGGCACCAT CCTGCTCGGC GTCTGGTATC TGATCATCAA TGCTGTGGTA 300 

CTGTTGATTT TATTGAGTGC CCTGGCTGAT CCGGATCAGT ATAACTTTTC AAGTTCTGAA 360 

CTGGGAGGTG ACTTTGAGTT CATGGATGAT GCCAACATGT GCATTGCCAT TGCGATTTCT 420 

CTTCTCATGA TCCTGATATG TGCTATGGCT ACTTACGGAG CGTACAAGCA ACGCGCAGCC 480 

TGGATCATCC CATTCTTCTG TTACCAGATC TTTGACTTTG CCCTGAACAT GTTGGTTGCA 540 

ATCACTGTGC TTATTTATCC AAACTCCATT CAGGAATACA TACGGCAACT GCCTCCTAAT 600 

TTTCCCTACA GAGATGATGT CATGTCAGTG AATCCTACCT GTTTGGTCCT TATTATTCTT 660 

CTGTTTATTA GCATTATCTT GACTTTTAAG GGTTACTTGA TTAGCTGTGT TTGGAACTGC 720 

TACCGATACA TCAATGGTAG GAACTCCTCT GATGTCCTGG TTTATGTTAC CAGCAATGAC 780 

ACTACGGTGC TGCTACCCCC GTATGATGAT GCCACTGTGA ATGGTGCTGC CAAGGAGCCA 840 

CCGCCACCTT ACGTGTCTGC CTAAGCCTTC AAGTGGGCGG AGCTGAGGGC AGCAGCTTGA 900 

CTTTGCAGAC ATCTGAGCAA TAGTTCTGTT ATTTCACTTT TGCCATGAGC CTCTCTGAGC 960 

TTGTTTGTTG CTGAAATGCT ACTTTTTAAA ATTTAGATGT TAGATTGAAA ACTGTAGTTT 1020 

TCAACATATG CTTTGCTAGA ACACTGTGAT AGATTAACTG TAGAATTCTT CCTGTACGAT 1080 

TGGGGATATA ACGGGCTTCA CTAACCTTCC CTAGGCATTG AAACTTCCCC CAAATCTGAT 1140 

GGACCTAGAA GTCTGCTTTT GTACCTGCTG GGCCCCAAAG TTGGGCATTT TTCTCTCTGT 1200 

TCCCTCTCTT TTGAAAATGT AAAATAAAAC CAAAAATAGA CAACTTTTTC TTCAGCCATT 1260 

CCAGCATAGA GAACAAAACC TTATGGAAAC AGGAATGTCA ATTGTGTAAT CATTGTTCTA 1320 

ATTAGGTAAA TAGAAGTCCT TATGTATGTG TTACAAGAAT TTCCCCCACA ACATCCTTTA 1380 

TGACTGAAGT TCAATGACAG TTTGTGTTTG GTGGTAAAGG ATTTTCTCCA TGGCCTGAAT 1440 

TAAGACCATT AGAAAGCACC AGGCCGTGGG AGCAGTGACC ATCTACTGAC TGTTCTTGTG 1500 

GATCTTGTGT CCAGGGACAT GGGGTGACAT GCCTCGTATG TGTTAGAGGG TGGAATGGAT 1560 

GTGTTTGGCG CTGCATGGGA TCTGGTGCCC CTCTTCTCCT GGATTCACAT CCCCACCCAG 1620 

GGCCCGCTTT TACTAAGTGT TCTGCCCTAG ATTGGTTCAA GGAGGTCATC CAACTGACTT 1680 

TATCAAGTGG AATTGGGATA TATTTGATAT ACTTCTGCCT AACAACATGG AAAAGGGTTT 1740 

TCTTTTCCCT GCAAGCTACA TCCTACTGCT TTGAACTTCC AAGTATGTCT AGTCACCTTT 1800 

TAAAATGTAA ACATTTTCAG AAAAATGAGG ATTGCCTTCC TTGTATGCGC TTTTTACCTT 1860 

GACTACCTGA ATTGCAAGGG ATTTTTATAT ATTCATATGT TACAAAGTCA GCAACTCTCC 1920 

TGTTGGTTCA TTATTGAATG TGCTGTAAAT TAAGTCGTTT GCAATTAAAA CAAGGTTTGC 1980 
CCACATCCAA AAAAAAAAAA AAAAA 

Seq ID NO: 381 Protein sequence 
Protein Accession #: CAB66876 

1 11 21 31 41 51 

| | I 1 I I 

MKMVAPWTRF YSNSCCLCCH VRTGTILLGV WYLIINAVVL LILLSALADP DQYNFSSSEL 60 

GGDFEFMDDA NMCIAIAISL LMILICAMAT YGAYKQRAAW IIPFFCYQIF DFALNMLVAI 120 

TVLIYPNSIQ EYIRQLPPNF PYRDDVMSVN PTCLVLIILL FISIILTFKG YLISCVWNCY 180 

RYINGRNSSD VLVYVTSNDT TVLLPPYDDA TVNGAAKEPP PPYVSA 

Seq ID NO: 382 DNA sequence 
Nucleic Acid Accession #: NM_002510 
Coding sequence: 92-1774 

1 11 21 31 41 51 

I I I I I I 

CAGATGCCAG AAGAACACTG TTGCTCTTGG TGGACGGGCC CAGAGGAATT CAGAGTTAAA 60 

CCTTGAGTGC CTGCGTCCGT GAGAATTCAG CATGGAATGT CTCTACTATT TCCTGGGATT 120 

TCTGCTCCTG GCTGCAAGAT TGCCACTTGA TGCCGCCAAA CGATTTCATG ATGTGCTGGG 180 

CAATGAAAGA CCTTCTGCTT ACATGAGGGA GCACAATCAA TTAAATGGCT GGTCTTCTGA 240 

TGAAAATGAC TGGAATGAAA AACTCTACCC AGTGTGGAAG CGGGGAGACA TGAGGTGGAA 300 

AAACTCCTGG AAGGGAGGCC GTGTGCAGGC GGTCCTGACC AGTGACTCAC CAGCCCTCGT 360 




GGGCTCAAAT ATAACATCTG CGGTGAACCT GATATTCCCT AGATGCCAAA AGGAAGATGC 420 

CAATGGCAAC ATAGTCTATG AGAAGAACTG CAGAAATGAG GCTGGTTTAT CTGCTGATCC 480 

ATATGTTTAC AACTGGACAG CATGGTCAGA GGACAGTGAC GGGGAAAATG GCACCGGCCA 540 

AAGCCATCAT AACGTCTTCC CTGATGGGAA ACCTTTTCCT CACCACCCCG GATGGAGAAG 600 

ATGGAATTTC ATCTACGTCT TCCACACACT TGGTCAGTAT TTCCAGAAAT TGGGACGATG 660 

TTCAGTGAGA GTTTCTGTGA ACACAGCCAA TGTGACACTT GGGCCTCAAC TCATGGAAGT 720 

GACTGTCTAC AGAAGACATG GACGGGCATA TGTTCCCATC GCACAAGTGA AAGATGTGTA 780 

CGTGGTAACA GATCAGATTC CTGTGTTTGT GACTATGTTC CAGAAGAACG ATCGAAATTC 840 

ATCCGACGAA ACCTTCCTCA AAGATCTCCC CATTATGTTT GATGTCCTGA TTCATGATCC 900 

TAGCCACTTC CTCAATTATT CTACCATTAA CTACAAGTGG AGCTTCGGGG ATAATACTGG 960 

CCTGTTTGTT TCCACCAATC ATACTGTGAA TCACACGTAT GTGCTCAATG GAACCTTCAG 1020 

CCTTAACCTC ACTGTGAAAG CTGCAGCACC AGGACCTTGT CCGCCACCGC CACCACCACC 1080 

CAGACCTTCA AAACCCACCC CTTCTTTAGG ACCTGCTGGT GACAACCCCC TGGAGCTGAG 1140 

TAGGATTCCT GATGAAAACT GCCAGATTAA CAGATATGGC CACTTTCAAG CCACCATCAC 1200 

AATTGTAGAG GGAATCTTAG AGGTTAACAT CATCCAGATG ACAGACGTCC TGATGCCGGT 1260 

GCCATGGCCT GAAAGCTCCC TAATAGACTT TGTCGTGACC TGCCAAGGGA GCATTCCCAC 1320 

GGAGGTCTGT ACCATCATTT CTGACCCCAC CTGCGAGATC ACCCAGAACA CAGTCTGCAG 1380 

CCCTGTGGAT GTGGATGAGA TGTGTCTGCT GACTGTGAGA CGAACCTTCA ATGGGTCTGG 1440 

GACGTACTGT GTGAACCTCA CCCTGGGGGA TGACACAAGC CTGGCTCTCA CGAGCACCCT 1500 

GATTTCTGTT CCTGACAGAG ACCCAGCCTC GCCTTTAAGG ATGGCAAACA GTGCCCTGAT 1560 

CTCCGTTGGC TGCTTGGCCA TATTTGTCAC TGTGATCTCC CTCTTGGTGT ACAAAAAACA 1620 

CAAGGAATAC AACCCAATAG AAAATAGTCC TGGGAATGTG GTCAGAAGCA AAGGCCTGAG 1680 

TGTCTTTCTC AACCGTGCAA AAGCCGTGTT CTTCCCGGGA AACCAGGAAA AGGATCCGCT 1740 

ACTCAAAAAC CAAGAATTTA AAGGAGTTTC TTAAATTTCG ACCTTGTTTC TGAAGCTCAC 1800 

TTTTCAGTGC CATTGATGTG AGATGTGCTG GAGTGGCTAT TAACCTTTTT TTCCTAAAGA 1860 

TTATTGTTAA ATAGATATTG TGGTTTGGGG AAGTTGAATT TTTTATAGGT TAAATGTCAT 1920 

TTTAGAGATG GGGAGAGGGA TTATACTGCA GGCAGCTTCA GCCATGTTGT GAAACTGATA 1980 

AAAGCAACTT AGCAAGGCTT CTTTTCATTA TTTTTTATGT TTCACTTATA AAGTCTTAGG 2040 

TAACTAGTAG GATAGAAACA CTGTGTCCCG AGAGTAAGGA GAGAAGCTAC TATTGATTAG 2100 

AGCCTAACCC AGGTTAACTG CAAGAAGAGG CGGGATACTT TCAGCTTTCC ATGTAACTGT 2160 

ATGCATAAAG CCAATGTAGT CCAGTTTCTA AGATCATGTT CCAAGCTAAC TGAATCCCAC 2220 

TTCAATACAC ACTCATGAAC TCCTGATGGA ACAATAACAG GCCCAAGCCT GTGGTATGAT 2280 

GTGCACACTT GCTAGACTCA GAAAAAATAC TACTCTCATA AATGGGTGGG AGTATTTTGG 2340 

TGACAACCTA CTTTGCTTGG CTGAGTGAAG GAATGATATT CATATATTCA TTTATTCCAT 2400 

GGACATTTAG TTAGTGCTTT TTATATACCA GGCATGATGC TGAGTGACAC TCTTGTGTAT 2460 

ATTTCCAAAT TTTTGTATAG TCGCTGCACA TATTTGAAAT CATATATTAA GACTTTCCAA 2520 

AGATGAGGTC CCTGGTTTTT CATGGCAACT TGATCAGTAA GGATTTCACC TCTGTTTGTA 2580 

ACTAAAACCA TCTACTATAT GTTAGACATG ACATTCTTTT TCTCTCCTTC CTGAAAAATA 2640 
AAGTGTGGGA AGAGACAAAA AAAAAAAAA 



Seq ID NO: 383 Protein sequence 
Protein Accession #: NP_002501 

1 11 21 31 41 51 

I | I I I I 

MECLYYFLGF LLLAARLPLD AAKRFHDVLG NERPSAYMRE HNQLNGWSSD ENDWNEKLYP 60 

VWKRGDMRWK NSWKGGRVQA VLTSDSPALV GSNITFAVNL IFPRCQKEDA NGNIVYEKNC 120 

RNEAGLSADP YVYNWTAWSE DSDGENGTGQ SHHNVFPDGK PFPHHPGWRR WNFIYVFHTL 180 

GQYFQKLGRC SVRVSVNTAN VTLGPQLMEV TVYRRHGRAY VPIAQVKDVY VVTDQIPVFV 240 

TMFQKNDRNS SDETFLKDLP IMFDVLIHDP SHFLNYSTIN YKWSFGDNTG LFVSTNHTVN 300 

HTYVLNGTFS LNLTVKAAAP GPCPPPPPPP RPSKPTPSLG PAGDNPLELS RIPDENCQIN 360 

RYGHFQATIT IVEGILEVNI IQMTDVLMPV PWPESSLIDF VVTCQGSIPT EVCTIISDPT 420 

CEITQNTVCS PVDVDEMCLL TVRRTFNGSG TYCVNLTLGD DTSLALTSTL ISVPDRDPAS 480 

PLRMANSALI SVGCLAIFVT VISLLVYKKH KEYNPIENSP GNVVRSKGLS VFLNRAKAVF 540 
FPGNQEKDPL LKNQEFKGVS 

Seq ID NO: 384 DNA sequence 
Nucleic Acid Accession #: NM_001134 
Coding sequence: 48-1877 

1 11 21 31 41 51 

I I I I I I 

TCCATATTGT GCTTCCACCA CTGCCAATAA CAAAATAACT AGCAACCATG AAGTGGGTGG 60 

AATCAATTTT TTTAATTTTC CTACTAAATT TTACTGAATC CAGAACACTG CATAGAAATG 120 

AATATGGAAT AGCTTCCATA TTGGATTCTT ACCAATGTAC TGCAGAGATA AGTTTAGCTG 180 

ACCTGGCTAC CATATTTTTT GCCCAGTTTG TTCAAGAAGC CACTTACAAG GAAGTAAGCA 240 

AAATGGTGAA AGATGCATTG ACTGCAATTG AGAAACCCAC TGGAGATGAA CAGTCTTCAG 3 00 

GGTGTTTAGA AAACCAGCTA CCTGCCTTTC TGGAAGAACT TTGCCATGAG AAAGAAATTT 3 60 

TGGAGAAGTA CGGACATTCA GACTGCTGCA GCCAAAGTGA AGAGGGAAGA CATAACTGTT 420 

TTCTTGCACA CAAAAAGCCC ACTCCAGCAT CGATCCCACT TTTCCAAGTT CCAGAACCTG 480 

TCACAAGCTG TGAAGCATAT GAAGAAGACA GGGAGACATT CATGAACAAA TTCATTTATG 540 

AGATAGCAAG AAGGCATCCC TTCCTGTATG CACCTACAAT TCTTCTTTGG GCTGCTCGCT 600 

ATGACAAAAT AATTCCATCT TGCTGCAAAG CTGAAAATGC AGTTGAATGC TTCCAAACAA 660 

AGGCAGCAAC AGTTACAAAA GAATTAAGAG AAAGCAGCTT GTTAAATCAA CATGCATGTG 720 

CAGTAATGAA AAATTTTGGG ACCCGAACTT TCCAAGCCAT AACTGTTACT AAACTGAGTC 780 

AGAAGTTTAC CAAAGTTAAT TTTACTGAAA TCCAGAAACT AGTCCTGGAT GTGGCCCATG 840 

TACATGAGCA CTGTTGCAGA GGAGATGTGC TGGATTGTCT GCAGGATGGG GAAAAAATCA 900 

TGTCCTACAT ATGTTCTCAA CAAGACACTC TGTCAAACAA AATAACAGAA TGCTGCAAAC 960 

TGACCACGCT GGAACGTGGT CAATGTATAA TTCATGCAGA AAATGATGAA AAACCTGAAG 1020 

GTCTATCTCC AAATCTAAAC AGGTTTTTAG GAGATAGAGA TTTTAACCAA TTTTCTTCAG 1080 

GGGAAAAAAA TATCTTCTTG GCAAGTTTTG TTCATGAATA TTCAAGAAGA CATCCTCAGC 1140 

TTGCTGTCTC AGTAATTCTA AGAGTTGCTA AAGGATACCA GGAGTTATTG GAGAAGTGTT 1200 

TCCAGACTGA AAACCCTCTT GAATGCCAAG ATAAAGGAGA AGAAGAATTA CAGAAATACA 1260 

TCCAGGAGAG CCAAGCATTG GCAAAGCGAA GCTGCGGCCT CTTCCAGAAA CTAGGAGAAT 1320 

ATTACTTACA AAATGCGTTT CTCGTTGCTT ACACAAAGAA AGCCCCCCAG CTGACCTCGT 1380 

CGGAGCTGAT GGCCATCACC AGAAAAATGG CAGCCACAGC AGCCACTTGT TGCCAACTCA 1440 

GTGAGGACAA ACTATTGGCC TGTGGCGAGG GAGCGGCTGA CATTATTATC GGACACTTAT 1500

GTATCAGACA TGAAATGACT CCAGTAAACC CTGGTGTTGG CCAGTGCTGC ACTTCTTCAT 1560

ATGCCAACAG GAGGCCATGC TTCAGCAGCT TGGTGGTGGA TGAAACATAT GTCCCTCCTG 1620

CATTCTCTGA TGACAAGTTC ATTTTCCATA AGGATCTGTG CCAAGCTCAG GGTGTAGCGC 1680

TGCAAACGAT GAAGCAAGAG TTTCTCATTA ACCTTGTGAA GCAAAAGCCA CAAATAACAG 1740

AGGAACAACT TGAGGCTGTC ATTGCAGATT TCTCAGGCCT GTTGGAGAAA TGCTGCCAAG 1800

GCCAGGAACA GGAAGTCTGC TTTGCTGAAG AGGGACAAAA ACTGATTTCA AAAACTCGTG 1860

CTGCTTTGGG AGTTTAAATT ACTTCAGGGG AAGAQAAGAC AAAACGAGTC TTTCATTCGG 1920

TGTGAACTTT TCTCTTTAAT TTTAACTGAT TTAACACTTT TTGTGAATTA ATGAAATGAT 1980
AAAGACTTTT ATGTGAGATT TCCTTATCAC AGAAATAAAA TATCTCCAAA TG

Seq ID NO: 385 Protein sequence
Protein Accession #: NP_001125

1 11 21 31 41 51

I I I I I I

MKWVESIFLI FLLNFTESRT LHRNEYGIAS ILDSYQCTAE ISLADLATIF FAQFVQEATY 60

KEVSKMVKDA LTAIEKPTGD EQSSGCLENQ LPAFLEELCH EKEILEKYGH SDCCSQSEEG 120

RHNCFLAHKK PTPASIPLFQ VPEPVTSCEA YEEDRETFMN KFIYEIARRH PFLYAPTILL 180

WAARYDKIIP SCCKAENAVE CFQTKAATVT KELRESSLLN QHACAVMKNF GTRTFQAITV 240

TKLSQKFTKV NFTEIQKLVL DVAHVHEHCC RGDVLDCLQD GEKIMSYICS QQDTLSNKIT 300

ECCKLTTLER GQCIIHAEND EKPEGLSPNL NRFLGDRDFN QFSSGEKNIF LASFVHEYSR 360

RHPQLAVSVI LRVAKGYQEL LEKCFQTENP LECQDKGEEE LQKYIQESQA LAKRSCGLFQ 420

KLGEYYLQNA FLVAYTKKAP QLTSSELMAI TRKMAATAAT CCQLSEDKLL ACGEGAADII 480

IGHLCIRHEM TPVNPGVGQC CTSSYANRRP CFSSLVVDET YVPPAFSDDK FIFHKDLCQA 540

QGVALQTMKQ EFLINLVKQK PQITEEQLEA VIADFSGLLE KCCQGQEQEV CFAEEGQKLI 600
SKTRAALGV


Seq ID NO: 386 DNA sequence 

Nucleic Acid Accession #: NM_002205.1

Coding sequence: 1..3149

1 11 21 31 41 51

I I I I I I

ATGGGGAGCC GGACGCCAGA GTCCCCTCTC CACGCCGTGC AGCTGCGCTG GGGCCCCCGG 60

CGCCGACCCC CGCTSSTGCC GCTGCTGTTG CTGCTSSTGC CGCCGCCACC CAGGGTCGGG 120

GGCTTCAACT TAGACGCGGA GGCCCCAGCA GTACTCTCGG GGCCCCCGGG CTCCTTCTTC 180

GGATTCTCAG TGGAGTTTTA CCGGCCGGGA ACAGACGGGG TCAGTGTGCT GGTGGGAGCA 240

CCCAAGGCTA ATACCAGCCA GCCAGGAGTG CTGCAGGGTG GTGCTGTCTA CCTCTGTCCT 300

TGGGGTGCCA GCCCCACACA GTGCACCCCC ATTGAATTTG ACAGCAAAGG CTCTCGGCTC 360

CTGGAGTCCT CACTGTCCAG CTCAGAGGGA GAGGAGCCTG TGGAGTACAA GTCCTTGCAG 420

TGGTTCGGGG CAACAGTTCG AGCCCATGGC TCCTCCATCT TGGCATGCGC TCCACTGTAC 480

AGCTGGCGCA CAGAGAAGGA GCCACTGAGC GACCCCGTGG GCACCTGCTA CCTCTCCACA 540

GATAACTTCA CCCGAATTCT GGAGTATGCA CCCTGCCGCT CAGATTTCAG CTGGGCAGCA 600

GGACAGGGTT ACTGCCAAGG AGGCTTCAGT GCCGAGTTCA CCAAGACTGG CCGTGTGGTT 660

TTAGGTGGAC CAGGAAGCTA TTTCTGGCAA GGCCAGATCC TGTCTGCCAC TCAGGAGCAG 720

ATTGCAGAAT CTTATTACCC CGAGTACCTG ATCAACCTGG TTCAGGGGCA GCTGCAGACT 780

CGCCAGGCCA GTTCCATCTA TGATGACAGC TACCTAGGAT ACTCTGTGGC TGTTGGTGAA 840

TTCAGTGGTG ATGACACAGA AGACTTTGTT GCTGGTGTGC CCAAAGGGAA CCTCACTTAC 900

GGCTATGTCA CCATCCTTAA TGGCTCAGAC ATTCGATCCC TCTACAACTT CTCAGGGGAA 960

CAGATGGCCT CCTACTTTGG CTATGCAGTG GCCGCCACAG ACGTCAATGG GGACGGGCTG 1020

GATGACTTGC TGGTGGGGGC ACCCCTGCTC ATGGATCGGA CCCCTGACGG GCGGCCTCAG 1080

GAGGTGGGCA GGGTCTACGT CTACCTGCAG CACCCAGCCG GCATAGAGCC CACGCCCACC 1140

CTTACCCTCA CTGGCCATGA TGAGTTTGGC CGATTTGGCA GCTCCTTGAC CCCCCTGGGG 1200

GACCTGGACC AGGATGGCTA CAATGATGTG GCCATCGGGG CTCCCTTTGG TGGGGAGACC 1260

CAGCAGGGAG TAGTGTTTGT ATTTCCTGGG GGCCCAGGAG GGCTGGGCTC TAAGCCTTCC 1320

CAGGTTCTGC AGCCCCTGTG GGCAGCCAGC CACACCCCAG ACTTCTTTGG CTCTGCCCTT 1380

CGAGGAGGCC GAGACCTGGA TGGCAATGGA TATCCTGATC TGATTGTGGG GTCCTTTGGT 1440

GTGGACAAGG CTGTGGTATA CAGGGGCCGC CCCATCGTGT CCGCTAGTGC CTCCCTCACC 1500

ATCTTCCCCG CCATGTTCAA CCCAGAGGAG CGGAGCTGCA GCTTAGAGGG GAACCCTGTG 1560

GCCTGCATCA ACCTTAGCTT CTGCCTCAAT GCTTCTGGAA AACACGTTGC TGACTCCATT 1620

GGTTTCACAG TGGAACTTCA GCTGGACTGG CAGAAGCAGA AGGGAGGGGT ACGGCGGGCA 1680

CTGTTCCTGG CCTCCAGGCA GGCAACCCTG ACCCAGACCC TGCTCATCCA GAATGGGGCT 1740

CGAGAGGATT GCAGAGAGAT GAAGATCTAC CTCAGGAACG AGTCAGAATT TCGAGACAAA 1800

CTCTCGCCGA TTCACATCGC TCTCAACTTC TCCTTGGACC CCCAAGCCCC AGTGGACAGC 1860

CACGGCCTCA GGCCAGCCCT ACATTATCAG AGCAAGAGCC GGATAGAGGA CAAGGCTCAG 1920

ATCTTGCTGG ACTGTGGAGA AGACAACATC TGTGTGCCTG ACCTGCAGCT GGAAGTGTTT 1980

GGGGAGCAGA ACCATGTGTA CCTGGGTGAC AAGAATGCCC TGAACCTCAC TTTCCATGCC 2040

CAGAATGTGG GTGAGGGTGG CGCCTATGAG GCTGAGCTTC GGGTCACCGC CCCTCCAGAG 2100

GCTGAGTACT CAGGACTCGT CAGACACCCA GGGAACTTCT CCAGCCTGAG CTGTGACTAC 2160

TTTGCCGTGA ACCAGAGCCG CCTGCTGGTG TGTGACCTGG GCAACCCCAT GAAGGCAGGA 2220

GCCAGTCTGT GGGGTGGCCT TCGGTTTACA GTCCCTCATC TCCGGGACAC TAAGAAAACC 2280

ATCCAGTTTG ACTTCCAGAT CCTCAGCAAG AATCTCAACA ACTCGCAAAG CGACGTGGTT 2340

TCCTTTCGGC TCTCCGTGGA GGCTCAGGCC CAGGTCACCC TGAACGGTGT CTCCAAGCCT 2400

GAGGCAGTGC TATTCCCAGT AAGCGACTGG CATCCCCGAG ACCAGCCTCA GAAGGAGGAG 2460

GACCTGGGAC CTGCTGTCCA CCATGTCTAT GAGCTCATCA ACCAAGGCCC CAGCTCCATT 2520

AGCCAGGGTG TGCTGGAACT CAGCTGTCCC CAGGCTCTGG AAGGTCAGCA GCTCCTATAT 2580

GTGACCAGAG TTACGGGACT CAACTGCACC ACCAATCACC CCATTAACCC AAAGGGCCTG 2640

GAGTTGGATC CCGAGGGTTC CCTGCACCAC CAGCAAAAAC GGGAAGCTCC AAGCCGCAGC 2700

TCTGCTTCCT CGGGACCTCA GATCCTGAAA TGCCCGGAGG CTGAGTGTTT CAGGCTGCGC 2760

TGTGAGCTCG GGCCCCTGCA CCAACAAGAG AGCCAAAGTC TGCAGTTGCA TTTCCGAGTC 2820

TGGGCCAAGA CTTTCTTGCA GCGGGAGCAC CAGCCATTTA GCCTGCAGTG TGAGGCTGTG 2880

TACAAAGCCC TGAAGATGCC CTACCGAATC CTGCCTCGGC AGCTGCCCCA AAAAGAGCGT 2940

CAGGTGGCCA CAGCTGTGCA ATGGACCAAG GCAGAAGGCA GCTATGGCGT CCCACTGTGG 3000

ATCATCATCC TAGCCATCCT GTTTGGCCTC CTGCTCCTAG GTCTACTCAT CTACATCCTC 3060

TACAAGCTTG GATTCTTCAA ACGCTCCCTC CCATATGGCA CCGCCATGGA AAAAGCTCAG 3120
CTCAAGCCTC CAGCCACCTC TGATGCCTGA

Seq ID NO: 387 Protein sequence
Protein Accession #: NP_002196.1

1 11 21 31 41 51

I I I I I I

MGSRTPESPL HAVQLRWGPR RRPPLLPLLL LLLPPPPRVG GFNLDAEAPA VLSGPPGSFF 60

GFSVEFYRPG TDGVSVLVGA PKANTSQPGV LQGGAVYLCP WGASPTQCTP IEFDSKGSRL 120

LESSLSSSEG EEPVEYKSLQ WFGATVRAHG SSILACAPLY SWRTEKEPLS DPVGTCYLST 180

DNFTRILEYA PCRSDFSWAA GQGYCQGGFS AEFTKTGRVV LGGPGSYFWQ GQILSATQEQ 240

IAESYYPEYL INLVQGQLQT RQASSIYDDS YLGYSVAVGE FSGDDTEDFV AGVPKGNLTY 300

GYVTILNGSD IRSLYNFSGE QMASYFGYAV AATDVNGDGL DDLLVGAPLL MDRTPDGRPQ 360

EVGRVYVYLQ HPAGIEPTPT LTLTGHDEFG RFGSSLTPLG DLDQDGYNDV AIGAPFGGET 420

QQGVVFVFPG GPGGLGSKPS QVLQPLWAAS HTPDFFGSAL RGGRDLDGNG YPDLIVGSFG 480

VDKAVVYRGR PIVSASASLT IFPAMFNPEE RSCSLEGNPV ACINLSFCLN ASGKHVADSI 540

GFTVELQLDW QKQKGGVRRA LFLASRQATL TQTLLIQNGA REDCREMKIY LRNESEFRDK 600

LSPIHIALNF SLDPQAPVDS HGLRPALHYQ SKSRIEDKAQ ILLDCGEDNI CVPDLQLEVF 660

GEQNHVYLGD KNALNLTFHA QNVGEGGAYE AELRVTAPPE AEYSGLVRHP GNFSSLSCDY 720

FAVNQSRLLV CDLGNPMKAG ASLWGGLRFT VPHLRDTKKT IQFDFQILSK NLNNSQSDVV 780

SFRLSVEAQA QVTLNGVSKP EAVLFPVSDW HPRDQPQKEE DLGPAVHHVY ELINQGPSSI 840

SQGVLELSCP QALEGQQLLY VTRVTGLNCT TNHPINPKGL ELDPEGSLHH QQKREAPSRS 900

SASSGPQILK CPEAECFRLR CELGPLHQQE SQSLQLHFRV WAKTFLQREH QPFSLQCEAV 960

YKALKMPYRI LPRQLPQKER QVATAVQWTK AEGSYGVPLW IIILAILFGL LLLGLLIYIL 1020
YKLGFFKRSL PYGTAMEKAQ LKPPATSDA






Seq ID NO: 388 DNA sequence

Nucleic Acid Accession #: NM_002425

Coding sequence: 26..1453



1 11 21 31 41 51 

I I I I I I 

AAAGAAGGTA AGGGCAGTGA GAATGATGCA TCTTGCATTC CTTGTGCTGT TGTGTCTGCC 60 

AGTCTGCTCT GCCTATCCTC TGAGTGGGGC AGCAAAAGAG GAGGACTCCA ACAAGGATCT 120 

TGCCCAGCAA TACCTAGAAA AGTACTACAA CCTCGAAAAG GATGTGAAAC AGTTTAGAAG 180 

AAAGGACAGT AATCTCATTG TTAAAAAAAT CCAAGGAATG CAGAAGTTCC TTGGGTTGGA 240

GGTGACAGGG AAGCTAGACA CTGACACTCT GGAGGTGATG CGCAAGCCCA GGTGTGGAGT 300 

TCCTGACGTT GGTCACTTCA GCTCCTTTCC TGGCATGCCG AAGTGGAGGA AAACCCACCT 360 

TACATACAGG ATTGTGAATT ATACACCAGA TTTGCCAAGA GATGCTGTTG ATTCTGCCAT 420 

TGAGAAAGCT CTGAAAGTCT GGGAAGAGGT GACTCCACTC ACATTCTCCA GGCTGTATGA 480 

AGGAGAGGCT GATATAATGA TCTCTTTCGC AGTTAAAGAA CATGGAGACT TTTACTCTTT 540

TGATGGCCCA GGACACAGTT TGGCTCATGC CTACCCACCT GGACCTGGGC TTTATGGAGA 600 

TATTCACTTT GATGATGATG AAAAATGGAC AGAAGATGCA TCAGGCACCA ATTTATTCCT 660 

CGTTGCTGCT CATGAACTTG GCCACTCCCT GGGGCTCTTT CACTCAGCCA ACACTGAAGC 720 

TTTGATGTAC CCACTCTACA ACTCATTCAC AGAGCTCGCC CAGTTCCGCC TTTCGCAAGA 780 

TGATGTGAAT GGCATTCAGT CTCTCTACGG ACCTCCCCCT GCCTCTACTG AGGAACCCCT 840

GGTGCCCACA AAATCTGTTC CTTCGGGATC TGAGATGCCA GCCAAGTGTG ATCCTGCTTT 900 

GTCCTTCGAT GCCATCAGCA CTCTGAGGGG AGAATATCTG TTCTTTAAAG ACAGATATTT 960 

TTGGCGAAGA TCCCACTGGA ACCCTGAACC TGAATTTCAT TTGATTTCTG CATTTTGGCC 1020 

CTCTCTTCCA TCATATTTGG ATGCTGCATA TGAAGTTAAC AGCAGGGACA CCGTTTTTAT 1080 

TTTTAAAGGA AATGAGTTCT GGGCCATCAG AGGAAATGAG GTACAAGCAG GTTATCCAAG 1140

AGGCATCCAT ACCCTGGGTT TTCCTCCAAC CATAAGGAAA ATTGATGCAG CTGTTTCTGA 1200 

CAAGGAAAAG AAGAAAACAT ACTTCTTTGC AGCGGACAAA TACTGGAGAT TTGATGAAAA 1260 

TAGCCAGTCC ATGGAGCAAG GCTTCCCTAG ACTAATAGCT GATGACTTTC CAGGAGTTGA 1320 

GCCTAAGGTT GATGCTGTAT TACAGGCATT TGGATTTTTC TACTTCTTCA GTGGATCATC 1380 

ACAGTTTGAG TTTGACCCCA ATGCCAGGAT GGTGACACAC ATATTAAAGA GTAACAGCTG 1440

GTTACATTGC TAGGCGAGAT AGGGGGAAGA CAGATATGGG TGTTTTTAAT AAATCTAATA 1500 

ATTATTCATC TAATGTATTA TGAGCCAAAA TGGTTAATTT TTCCTGCATG TTCTGTGACT 1560 

GAAGAAGATG AGCCTTGCAG ATATCTGCAT GTGTCATGAA GAATGTTTCT GGAATTCTTC 1620 

ACTTGCTTTT GAATTGCACT GAACAGAATT AAGAAATACT CATGTGCAAT AGGTGAGAGA 1680 

ATGTATTTTC ATAGATGTGT TATTACTTCC TCAATAAAAA GTTTTATTTT GGGCCTGTTC 1740
CTT 



Seq ID NO: 389 Protein sequence 
Protein Accession #: NP_002416 



1 11 21 31 41 51 

I I I I I I 

MHLAFLVLLC LPVCSAYPLS GAAKEEDSNK DLAQQYLEKY YNLEKDVKQF RRKDSNLIVK 60 

KIQGMQKFLG LEVTGKLDTD TLEVMRKPRC GVPDVGHFSS FPGMPKWRKT HLTYRIVNYT 120 

PDLPRDAVDS AIEKALKVWE EVTPLTFSRL YEGEADIMIS FAVKEHGDFY SFDGPGHSLA 180

HAYPPGPGLY GDIHFDDDEK WTEDASGTNL FLVAAHELGH SLGLFHSANT EALMYPLYNS 240

FTELAQFRLS QDDVNGIQSL YGPPPASTEE PLVPTKSVPS GSEMPAKCDP ALSFDAISTL 300

RGEYLFFKDR YFWRRSHWNP EPEFHLISAF WPSLPSYLDA AYEVNSRDTV FIFKGNEFWA 360

IRGNEVQAGY PRGIHTLGFP PTIRKIDAAV SDKEKKKTYF FAADKYWRFD ENSQSMEQGF 420
PRLIADDFPG VEPKVDAVLQ AFGFFYFFSG SSQFEFDPNA RMVTHILKSN SWLHC



Seq ID NO: 390 DNA sequence 

Nucleic Acid Accession #: NM_002421.2

Coding sequence: 1..1409 



1 11 21 31 41 51 

I I I I I I 

ATGCACAGCT TTCCTCCACT GCTGCTGCTG CTGTTCTGGG GTGTGGTGTC ACACAGCTTC 60 

CCAGCGACTC TAGAAACACA AGAGCAAGAT GTGGACTTAG TCCAGAAATA CCTGGAAAAA 120 

85 TACTACAACC TGAAGAATGA TGGGAGGCAA GTTGAAAAGC GGAGAAATAG TGGCCCAGTG 180 

GTTGAAAAAT TGAAGCAAAT GCAGGAATTC TTTGGGCTGA AAGTGACTGG GAAACCAGAT 240 

GCTGAAACCC TGAAGGTGAT GAAGCAGCCC AGATGTGGAG TGCCTGATGT GGCTCAGTTT 300

GTCCTCACTG AGGGGAACCC TCGCTGGGAG CAAACACATC TGACCTACAG GATTGAAAAT 360

TACACGCCAG ATTTGCCAAG AGCAGATGTG GACCATGCCA TTGAGAAAGC CTTCCAACTC 420

TGGAGTAATG TCACACCTCT GACATTCACC AAGGTCTCTG AGGGTCAAGC AGACATCATG 480

ATATCTTTTG TCAGGGGAGA TCATCGGGAC AACTCTCCTT TTGATGGACC TGGAGGAAAT 540

CTTGCTCATG CTTTTCAACC AGGCCCAGGT ATTGGAGGGG ATGCTCATTT TGATGAAGAT 600

GAAAGGTGGA CCAACAATTT CAGAGAGTAC AACTTACATC GTGTTGCGGC TCATGAACTC 660

GGCCATTCTC TTGGACTCTC CCATTCTACT GATATCGGGG CTTTGATGTA CCCTAGCTAC 720

ACCTTCAGTG GTGATGTTCA GCTAGCTCAG GATGACATTG ATGGCATCCA AGCCATATAT 780

GGACGTTCCC AAAATCCTGT CCAGCCCATC GGCCCACAAA CCCCAAAAGC ATGTGACAGT 840

AAGCTAACCT TTGATGCTAT AACTACGATT CGGGGAGAAG TGATGTTCTT TAAAGACAGA 900

TTCTACATGC GCACAAATCC CTTCTACCCG GAAGTTGAGC TCAATTTCAT TTCTGTTTTC 960

TGGCCACAAC TGCCAAATGG GCTTGAAGCT GCTTACGAAT TTGCCGACAG AGATGAAGTC 1020

CGGTTTTTCA AAGGGAATAA GTACTGGGCT GTTCAGGGAC AGAATGTGCT ACACGGATAC 1080

CCCAAGGACA TCTACAGCTC CTTTGGCTTC CCTAGAACTG TGAAGCATAT CGATGCTGCT 1140

CTTTCTGAGG AAAACACTGG AAAAACCTAC TTCTTTGTTG CTAACAAATA CTGGAGGTAT 1200

GATGAATATA AACGATCTAT GGATCCAGGT TATCCCAAAA TGATAGCACA TGACTTTCCT 1260

GGAATTGGCC ACAAAGTTGA TGCAGTTTTC ATGAAAGATG GATTTTTCTA TTTCTTTCAT 1320

GGAACAAGAC AATACAAATT TGATCCTAAA ACGAAGAGAA TTTTGACTCT CCAGAAAGCT 1380
AATAGCTGGT TCAACTGCAG GAAAAATTAG

Seq ID NO: 391 Protein sequence
Protein Accession #: NP_002412.1

1 11 21 31 41 51

I I I I I I

MHSFPPLLLL LFWGVVSHSF PATLETQEQD VDLVQKYLEK YYNLKNDGRQ VEKRRNSGPV 60

VEKLKQMQEF FGLKVTGKPD AETLKVMKQP RCGVPDVAQF VLTEGNPRWE QTHLTYRIEN 120

YTPDLPRADV DHAIEKAFQL WSNVTPLTFT KVSEGQADIM ISFVRGDHRD NSPFDGPGGN 180

LAHAFQPGPG IGGDAHFDED ERWTNNFREY NLHRVAAHEL GHSLGLSHST DIGALMYPSY 240

TFSGDVQLAQ DDIDGIQAIY GRSQNPVQPI GPQTPKACDS KLTFDAITTI RGEVMFFKDR 300

FYMRTNPFYP EVELNFISVF WPQLPNGLEA AYEFADRDEV RFFKGNKYWA VQGQNVLHGY 360

PKDIYSSFGF PRTVKHIDAA LSEENTGKTY FFVANKYWRY DEYKRSMDPG YPKMIAHDFP 420
GIGHKVDAVF MKDGFFYFFH GTRQYKFDPK TKRILTLQKA NSWFNCRKN

Seq ID NO: 392 DNA sequence

Nucleic Acid Accession #: NM_002421.2

Coding sequence: 1..1409

1 11 21 31 41 51

I I I I I I

ATGCACAGCT TTCCTCCACT GCTGCTGCTG CTGTTCTGGG GTGTGGTGTC ACACAGCTTC 60

CCAGCGACTC TAGAAACACA AGAGCAAGAT GTGGACTTAG TCCAGAAATA CCTGGAAAAA 120

TACTACAACC TGAAGAATGA TGGGAGGCAA GTTGAAAAGC GGAGAAATAG TGGCCCAGTG 180

GTTGAAAAAT TGAAGCAAAT GCAGGAATTC TTTGGGCTGA AAGTGACTGG GAAACCAGAT 240

GCTGAAACCC TGAAGGTGAT GAAGCAGCCC AGATGTGGAG TGCCTGATGT GGCTCAGTTT 300

GTCCTCACTG AGGGGAACCC TCGCTGGGAG CAAACACATC TGACCTACAG GATTGAAAAT 360

TACACGCCAG ATTTGCCAAG AGCAGATGTG GACCATGCCA TTGAGAAAGC CTTCCAACTC 420

TGGAGTAATG TCACACCTCT GACATTCACC AAGGTCTCTG AGGGTCAAGC AGACATCATG 480

ATATCTTTTG TCAGGGGAGA TCATCGGGAC AACTCTCCTT TTGATGGACC TGGAGGAAAT 540

CTTGCTCATG CTTTTCAACC AGGCCCAGGT ATTGGAGGGG ATGCTCATTT TGATGAAGAT 600

GAAAGGTGGA CCAACAATTT CAGAGAGTAC AACTTACATC GTGTTGCGGC TCATGCCCTC 660

GGCCATTCTC TTGGACTCTC CCATTCTACT GATATCGGGG CTTTGATGTA CCCTAGCTAC 720

ACCTTCAGTG GTGATGTTCA GCTAGCTCAG GATGACATTG ATGGCATCCA AGCCATATAT 780

GGACGTTCCC AAAATCCTGT CCAGCCCATC GGCCCACAAA CCCCAAAAGC ATGTGACAGT 840

AAGCTAACCT TTGATGCTAT AACTACGATT CGGGGAGAAG TGATGTTCTT TAAAGACAGA 900

TTCTACATGC GCACAAATCC CTTCTACCCG GAAGTTGAGC TCAATTTCAT TTCTGTTTTC 960

TGGCCACAAC TGCCAAATGG GCTTGAAGCT GCTTACGAAT TTGCCGACAG AGATGAAGTC 1020

CGGTTTTTCA AAGGGAATAA GTACTGGGCT GTTCAGGGAC AGAATGTGCT ACACGGATAC 1080

CCCAAGGACA TCTACAGCTC CTTTGGCTTC CCTAGAACTG TGAAGCATAT CGATGCTGCT 1140

CTTTCTGAGG AAAACACTGG AAAAACCTAC TTCTTTGTTG CTAACAAATA CTGGAGGTAT 1200

GATGAATATA AACGATCTAT GGATCCAGGT TATCCCAAAA TGATAGCACA TGACTTTCCT 1260

GGAATTGGCC ACAAAGTTGA TGCAGTTTTC ATGAAAGATG GATTTTTCTA TTTCTTTCAT 1320

GGAACAAGAC AATACAAATT TGATCCTAAA ACGAAGAGAA TTTTGACTCT CCAGAAAGCT 1380
AATAGCTGGT TCAACTGCAG GAAAAATTAG

Seq ID NO: 393 Protein sequence
Protein Accession #: NP_002412.1

1 11 21 31 41 51

I I I I I I

MHSFPPLLLL LFWGVVSHSF PATLETQEQD VDLVQKYLEK YYNLKNDGRQ VEKRRNSGPV 60

VEKLKQMQEF FGLKVTGKPD AETLKVMKQP RCGVPDVAQF VLTEGNPRWE QTHLTYRIEN 120

YTPDLPRADV DHAIEKAFQL WSNVTPLTFT KVSEGQADIM ISFVRGDHRD NSPFDGPGGN 180

LAHAFQPGPG IGGDAHFDED ERWTNNFREY NLHRVAAHAL GHSLGLSHST DIGALMYPSY 240

TFSGDVQLAQ DDIDGIQAIY GRSQNPVQPI GPQTPKACDS KLTFDAITTI RGEVMFFKDR 300

FYMRTNPFYP EVELNFISVF WPQLPNGLEA AYEFADRDEV RFFKGNKYWA VQGQNVLHGY 360

PKDIYSSFGF PRTVKHIDAA LSEENTGKTY FFVANKYWRY DEYKRSMDPG YPKMIAHDFP 420
GIGHKVDAVF MKDGFFYFFH GTRQYKFDPK TKRILTLQKA NSWFNCRKN

Seq ID NO: 394 DNA sequence 

Nucleic Acid Accession #: NM_014331.2 

Coding sequence: 1..1506 



1 11 21 31 41 51

I I I I I I

ATGGTCAGAA AGCCTGTTGT GTCCACCATC TCCAAAGGAG GTTACCTGCA GGGAAATGTT 60

AACGQQAQGC TGCCTTCCCT GGGCAACAAG GAGCCACCTG GGCAGGAGAA AGTGCAGCTG 120 

AAGAGGAAAG TCACTTTACT GAGGGGAGTC TCCATTATCA TTGGCACCAT CATTGGAGCA 180 

GGAATCTTCA TCTCTCCTAA GGGCGTGCTC CAGAACACGG GCAGCGTGGG CATGTCTCTG 240 

ACCATCTGGA CGGTGTGTGG GGTCCTGTCA CTATTTGGAG CTTTGTCTTA TGCTGAATTG 300 

GGAACAACTA TAAAGAAATC TGGAGGTCAT TACACATATA TTTTGGAAGT CTTTGGTCCA 360 

TTACCAGCTT TTGTACGAGT CTGGGTGGAA CTCCTCATAA TACGCCCTGC AGCTACTGCT 420 

GTGATATCCC TGGCATTTGG ACGCTACATT CTGGAACCAT TTTTTATTCA ATGTGAAATC 480 

CCTGAACTTG CGATCAAGCT CATTACAGCT GTGGGCATAA CTGTAGTGAT GGTCCTAAAT 540 

AGCATGAGTG TCAGCTGGAG GGCCCGGATC CAGATTTTCT TAACCTTTTG CAAGCTCACA 600 

GCAATTCTQA TAATTATAGT CCCTGGAGTT ATGCAGCTAA TTAAAGGTCA AACGCAGAAC 660 

TTTAAAGACG CGTTTTCAGG AAGAGATTCA AGTATTACGC GGTTGCCACT GGCTTTTTAT 720

TATGGAATGT ATGCATATGC TGGCTGGTTT TACCTCAACT TTGTTACTGA AGAAGTAGAA 780 

AACCCTGAAA AAACCATTCC CCTTGCAATA TGTATATCCA TGGCCATTGT CACCATTGGC 840 

TATGTGCTGA CAAATGTGGC CTACTTTACG ACCATTAATG CTGAGGAGCT GCTGCTTTCA 900

AATGCAGTGG CAGTGACCTT TTCTGAGCGG CTACTGGGAA ATTTCTCATT AGCAGTTCCG 960 

ATCTTTGTTG CCCTCTCCTG CTTTGGCTCC ATGAACGGTG GTGTGTTTGC TGTCTCCAGG 1020 

TTATTCTATG TTGCGTCTCG AGAGGGTCAC CTTCCAGAAA TCCTCTCCAT GATTCATGTC 1080 

CGCAAGCACA CTCCTCTACC AGCTGTTATT GTTTTGCACC CTTTGACAAT GATAATGCTC 1140 

TTCTCTGGAG ACCTCGACAG TCTTTTGAAT TTCCTCAGTT TTGCCAGGTG GCTTTTTATT 1200 

GGGCTGGCAG TTGCTGGGCT GATTTATCTT CGATACAAAT GCCCAGATAT GCATCGTCCT 1260 

TTCAAGGTGC CACTGTTCAT CCCAGCTTTG TTTTCCTTCA CATGCCTCTT CATGGTTGCC 1320 

CTTTCCCTCT ATTCGGACCC ATTTAGTACA GGGATTGGCT TCGTCATCAC TCTGACTGGA 1380 

GTCCCTGCGT ATTATCTCTT TATTATATGG GACAAGAAAC CCAGGTGGTT TAGAATAATG 1440 

TCAGAGAAAA TAACCAGAAC ATTACAAATA ATACTGGAAG TTGTACCAGA AGAAGATAAG 1500 

TTATGAACTA ATGGACTTGA GATCTTGGCA ATCTGCCCAA GGGGAGACAC AAAATAGGGA 1560 

TTTTTACTTC ATTTTCTGAA AGTCTAGAGA ATTACAACTT TGGTGATAAA CAAAAGGAGT 1620 

CAGTTATTTT TATTCATATA TTTTAGCATA TTCGAACTAA TTTCTAAGAA ATTTAGTTAT 1680 

AACTCTATGT AGTTATAGAA AGTGAATATG CAGTTATTCT ATGAGTCGCA CAATTCTTGA 1740 

GTCTCTGATA CCTACCTATT GGGGTTAGGA GAAAAGACTA GACAATTACT ATGTGGTCAT 1800 

TCTCTACAAC ATATGTTAGC ACGGCAAAGA ACCTTCAAAT TGAAGACTGA GATTTTTCTG 1860 

TATATATGGG TTTTGTAAAG ATGGTTTXAC ACACTACAGA TGTCTATACT GTGAAAAGTG 1920 

TTTTCAATTC TGAAAAAAAG CATACATCAT GATTATGGCA AAGAGGAGAG AAAGAAATTT 1980 

ATTTTACATT GACATTGCAT TGCTTCCCCT TAGATACCAA TTTAGATAAC AAACACTCAT 2040 

GCTTTAATGG ATTATACCCA GAGCACTTTG AACAAAGGTC AGTGGGGATT GTTGAATACA 2100 

TTAAAGAAGA GTTTCTAGGG GCTACTGTTT ATGAGACACA TCCAGGAGTT ATGTTTAAGT 2160 

AAAAATCCTT GAGAATTTAT TATGTCAGAT GTTTTTTCAT TCATTATCAG GAAGTTTTAG 2220 

TTATCTGTCA TTTTTTTTTT TCACATCAGT TTGATCAGGA AAGTGTATAA CACATCTTAG 2280 

AGCAAGAGTT AGTTTGGTAT TAAATCCTCA TTAGAACAAC CACCTGTTTC ACTAATAACT 2340 

TACCCCTGAT GAGTCTATCT AAACATATGC ATTTTAAGCC TTCAAATTAC ATTATCAACA 2400 

TGAGAGAAAT AACCAACAAA GAAGATGTTC AAAATAATAG TCCCATATCT GTAATCATAT 2460 

CTACATGCAA TGTTAGTAAT TCTGAAGTTT TTTAAATTTA TGGCTATTTT TACACGATGA 2520 

TGAATTTTGA CAGTTTGTGC ATTTTCTTTA TACATTTTAT ATTCTTCTGT TAAAATATCT 2580 

CTTCAGATGA AACTGTCCAG ATTAATTAGG AAAAGGCATA TATTAACATA AAAATTGCAA 2640 

AAGAAATGTC GCTGTAAATA AGATTTACAA CTGATGTTTC TAGAAAATTT CCACTTCTAT 2700 

ATCTAGGCTT TGTCAGTAAT TTCCACACCT TAATTATCAT TCAACTTGCA AAAGAGACAA 2760 

CTGATAAGAA GAAAATTGAA ATGAGAATCT GTGGATAAGT GTTTGTGTTC AGAAGATGTT 2820 

GTTTTGCCAG TATTAGAAAA TACTGTGAGC CGGGCATGGT GGCTTACATC TGTAATCCCA 2880 

GCACTTTGGG AGGCTGAGGG GGTGGATCAC CTGAGGTCGG GAGTTCTAGA CCAGCCTGAC 2940 

CAACATGGAG AAACCCCATC TCTACTAAAA ATACAAAATT AGCTGGGCAT GGTGGCACAT 3000 

GCTGGTAATC TCAGCTATTG AGGAGGCTGA GGCAGGAGAA TTGCTTGAAC CCGGGAGGCG 3060 

GAGGTTGCAG TGAGCCAAGA TTGCACCACT GTACTCCAGC CTGGGTGACA AAGTCAGACT 3120 
CCATCTCCAA AAAAAAAAAA AAAA 

Seq ID NO: 395 Protein sequence 
Protein Accession #: NP_055146.1 

1 11 21 31 41 51

I I I I I I

MVRKPVVSTI SKGGYLQGNV NGRLPSLGNK EPPGQEKVQL KRKVTLLRGV SIIIGTIIGA 60

GIFISPKGVL QNTGSVGMSL TIWTVCGVLS LFGALSYAEL GTTIKKSGGH YTYILEVFGP 120

LPAFVRVWVE LLIIRPAATA VISLAFGRYI LEPFFIQCEI PELAIKLITA VGITVVMVLN 180

SMSVSWSARI QIFLTFCKLT AILIIIVPGV MQLIKGQTQN FKDAFSGRDS SITRLPLAFY 240

YGMYAYAGWF YLNFVTEEVE NPEKTIPLAI CISMAITIGV YVLTNVAYFT TINAEELLLS 300

NAVAVTFSER LLGNFSLAVP IFVALSCFGS MNGGVFAVSR LFYVASREGH LPEILSMIHV 360

RKHTPLPAVI VLHPLTMIML FSGDLDSLLN FLSFARWLFI GLAVAGLIYL RYKCPDMHRP 420

FKVPLFIPAL FSFTCLFMVA LSLYSDPFST GIGFVITLTG VPAYYLFIIW DKKPRWFRIK 480
SEKITRTLQI ILEVVPEEDK L



Seq ID NO: 396 DNA sequence
Nucleic Acid Accession #: NM_006528
Coding sequence: 57..764

1 11 21 31 41 51 

I I I I I I 

GCCGCCAGOG GCTTTCTCGG ACGCCTTGCC CAGCGGGCCG CCCGACCCCC TGCACCATGG 60 

ACCCCGCTCG CCCCCTGGGG CTGTCGATTC TGCTGCTTTT CCTGACGGAG GCTGCACTGG 120 

GCGATGCTGC TCAGGAGCCA ACAGGAAATA ACGCGGAGAT CTGTCTCCTG CCCCTAGACT 180 

ACGGACCCTG CCGGGCCCTA CTTCTCCGTT ACTACTACGA CAGGTACACG CAGAGCTGCC 240

GCCAGTTCCT GTACGGGGGC TGCGAGGGCA ACGCCAACAA TTTCTACACC TGGGAGGCTT 300

GCGACGATGC TTGCTGGAGG ATAGAAAAAG TTCCCAAAGT TTGCCGGCTG CAAGTGAGTG 360

TGGACGACCA GTGTGAGGGG TCCACAGAAA AGTATTTCTT TAATCTAAGT TCCATGACAT 420

GTGAAAAATT CTTTTCCGGT GGGTGTCACC GGAACCGGAT TGAGAACAGG TTTCCAGATG 480

AAGCTACTTG TATGGGCTTC TGCGCACCAA AGAAAATTCC ATCATTTTGC TACAGTCCAA 540 

AAGATGAGGG ACTGTGCTCT GCCAATGTGA CTCGCTATTA TTTTAATCCA AGATACAGAA 600 

CCTGTGATGC TTTCACCTAT ACTGGCTGTG GAGGGAATGA CAATAACTTT GTTAGCAGGG 660
AGGATTGCAA ACGTGCATGT GCAAAAGCTT TGAAAAAGAA AAAGAAGATG CCAAAGCTTC 720

GCTTTGCCAG TAGAATCCGG AAAATTCGGA AGAAGCAATT TTAAACATTC TTAATATGTC 780 

ATCTTGTTTG TCTTTATGGC TTATTTGCCT TTATGGTTGT ATCTGAAGAA TAATATGACA 840 

GCATGAGGAA ACAAATCATT GGTGATTTAT TCACCAGTTT TTATTAATAC AAGTCACTTT 900 

TTCAAAAATT TGGATTTTTT TATATATAAC TAGCTGCTAT TCAAATGTGA GTCTACCATT 960 

TTTAATTTAT GGTTCAACTG TTTGTGAGAC GAATTCTTGC AATGCATAAG ATATAAAAGC 1020 

AAATATGACT CACTCATTTC TTGGGGTCGT ATTCCTGATT TCAGAAGAGG ATCATAACTG 1080 

AAACAACATA AGACAATATA ATCATGTGCT TTTAACATAT TTGAGAATAA AAAGGACTAG 1140 
CC 



Seq ID NO: 397 Protein sequence 
Protein Accession #: NP_006519 

1 11 21 31 41 51

I I I I I I 

MDPARPLGLS ILLLFLTEAA LGDAAQEPTG NNAEICLLPL DYGPCRALLL RYYYDRYTQS 60 

CRQFLYGGCE GNANNFYTWE ACDDACWRIE KVPKVCRLQV SVDDQCEGST EKYFFNLSSM 120 

TCEKFFSGGC HRNRIENRFP DEATCMGFCA PKKIPSFCYS PKDEGLCSAN VTRYYFNPRY 180 
RTCDAFTYTG CGGNDNNFVS REDCKRACAK ALKKKKKMPK LRFASRIRKI RKKQF 

Seq ID NO: 398 DNA sequence 

Nucleic Acid Accession #: NM_001508.1 

Coding sequence: 1..1361 

1 11 21 31 41 51 

| | I I I I 

ATGGCTTCAC CCAGCCTCCC GGGCAGTGAC TGCTCCCAAA TCATTGATCA CAGTCATGTC 60 

CCCGAGTTTG AGGTGGCCAC CTGGATCAAA ATCACCCTTA TTCTGGTGTA CCTGATCATC 120 

TTCGTGATGG GCCTTCTGGG GAACAGCGTC ACCATTCGGG TCACCCAGGT GCTGCAGAAG 180 

AAAGGATACT TGCAGAAGGA GGTGACAGAC CACATGGTGA GTTTGGCTTG CTCGGACATC 240 

TTGGTGTTCC TCATCGGCAT GCCCATGGAG TTCTACAGCA TCATCTGGAA TCCCCTGACC 300 

ACGTCCAGCT ACACCCTGTC CTGCAAGCTG CACACTTTCC TCTTCGAGGC CTGCAGCTAC 360 

GCTACGCTGC TGCACGTGCT GACGCTCAGC TTTGAGCGCT ACATCGCCAT CTGTCACCCC 420 

TTCAGGTACA AGGCTGTGTC GGGACCTTGC CAGGTGAAGC TGCTGATTGG CTTCGTCTGG 480 

GTCACCTCCG CCCTGGTGGC ACTGCCCTTG CTGTTTGCCA TGGGTACTGA GTACCCCCTG 540 

GTGAACGTGC CCAGCCACCG GGGTCTCACT TGCAACCGCT CCAGCACCCG CCACCACGAG 600 

CAGCCCGAGA CCTCCAATAT GTCCATCTGT ACCAACCTCT CCAGCCGCTG GACCGTGTTC 660 

CAGTCCAGCA TCTTCGGCGC CTTCGTGGTC TACCTCGTGG TCCTGCTCTC CGTAGCCTTC 720 

ATGTGCTGGA ACATGATGCA GGTGCTCATG AAAAGCCAGA AGGGCTCGGT GGCCGGGGGC 780 

ACGCGGCCTC CGCAGCTGAG GAAGTCCGAG AGCGAAGAGA GCAGGACCGC CAGGAGGCAG 840 

ACCATCATCT TCCTGAGGCT GATTGTTGTG ACATTGGCCG TATGCTGGAT GCCCAACCAG 900 

ATTCGGAGGA TCATGGCTGC GGCCAAACCC AAGCACGACT GGACGAGGTC CTACTTCCGG 960 

GCGTACATGA TCCTCCTCCC CTTCTCGGAG ACGTTTTTCT ACCTCAGCTC GGTCATCAAC 1020 

CCGCTCCTGT ACACGGTGTC CTCGCAGCAG TTTCGGCGGG TGTTCGTGCA GGTGCTGTGC 1080 

TGCCGCCTGT CGCTGCAGCA CGCCAACCAC GAGAAGCGCC TGCGCGTACA TGCGCACTCC 1140 

ACCACCGACA GCGCCCGCTT TGTGCAGCGC CCGTTGCTCT TCGCGTCCCG GCGCCAGTCC 1200 

TCTGCAAGGA GAACTGAGAA GATTTTCTTA AGCACTTTTC AGAGCGAGGC CGAGCCCCAG 1260 

TCTAAGTCCC AGTCATTGAG TCTCGAGTCA CTAGAGCCCA ACTCAGGCGC GAAACCAGCC 1320 
AATTCTGCTG CAGAGAATGG TTTTCAGGAG CATGAAGTTT GA 

Seq ID NO: 399 Protein sequence
Protein Accession #: NP_001499.1

1 11 21 31 41 51

I I I I I I

MASPSLPGSD CSQIIDHSHV PEFEVATWIK ITLILVYLII FVMGLLGNSV TIRVTQVLQK 60

KGYLQKEVTD HMVSLACSDI LVFLIGMPME FYSIIWNPLT TSSYTLSCKL HTFLFEACSY 120

ATLLHVLTLS FERYIAICHP FRYKAVSGPC QVKLLIGFVW VTSALVALPL LFAMGTEYPL 180

VNVPSHRGLT CNRSSTRHHE QPETSNMSIC TNLSSRWTVF QSSIFGAFVV YLVVLLSVAF 240

MCWNMMQVLM KSQKGSLAGG TRPPQLRKSE SEESRTARRQ TIIFLRLIVV TLAVCWMPNQ 300

IRRIMAAAKP KHDWTRSYFR AYMILLPFSE TFFYLSSVIN PLLYTVSSQQ FRRVFVQVLC 360 

CRLSLQHANH EKRLRVHAHS TTDSARFVQR PLLFASRRQS SARRTEKIFL STFQSEAEPQ 420 
SKSQSLSLES LEPNSGAKPA NSAAENGFQE HEV 

Seq ID NO: 400 DNA sequence 

Nucleic Acid Accession #: NM_006475.1 

Coding sequence: 28..2538

1 11 21 31 41 51

I I I I I I

AACAGAACTG CAACGGAGAG ACTCAAGATG ATTCCCTTTT TACCCATGTT TTCTCTACTA 60 

TTGCTGCTTA TTGTTAACCC TATAAACGCC AACAATCATT ATGACAAGAT CTTGGCTCAT 120 

AGTCGTATCA GGGGTCGGGA CCAAGGCCCA AATGTCTGTG CCCTTCAACA GATTTTGGGC 180 

ACCAAAAAGA AATACTTCAG CACTTGTAAG AACTGGTATA AAAAGTCCAT CTGTGGACAG 240 

AAAACGACTG TTTTATATGA ATGTTGCCCT GGTTATATGA GAATGGAAGG AATGAAAGGC 300 

TGCCCAGCAG TTTTGCCCAT TGACCATGTT TATGGCACTC TGGGCATCGT GGGAGCCACC 360

ACAACGCAGC GCTATTCTGA CGCCTCAAAA CTGAGGGAGG AGATCGAGGG AAAGGGATCC 420 

TTCACTTACT TTGCACCGAG TAATGAGGCT TGGGACAACT TGGATTCTGA TATCCGTAGA 480 

GGTTTGGAGA GCAACGTGAA TGTTGAATTA CTGAATGCTT TACATAGTCA CATGATTAAT 540 

AAGAGAATGT TGACCAAGGA CTTAAAAAAT GGCATGATTA TTCCTTCAAT GTATAACAAT 600 

TTGGGGCTTT TCATTAACCA TTATCCTAAT GGGGTTGTCA CTGTTAATTG TGCTCGAATC 660 

ATCCATGGGA ACCAGATTGC AACAAATGGT GTTGTCCATG TCATTGACCG TGTGCTTACA 720

CAAATTGGTA CCTCAATTCA AGACTTCATT GAAGCAGAAG ATGACCTTTC ATCTTTTAGA 780 

GCAGCTGCCA TCACATCGGA CATATTGGAG GCCCTTGGAA GAGACGGTCA CTTCACACTC 840 

TTTGCTCCCA CCAATGAGGC TTTTGAGAAA CTTCCACGAG GTGTCCTAGA AAGGTTCATG 900 

GGAGACAAAG TGGCTTCCGA AGCTCTTATG AAGTACCACA TCTTAAATAC TCTCCAGTGT 960 

TCTGAGTCTA TTATGGGAGG AGCAGTCTTT GAGACGCTGG AAGGAAATAC AATTGAGATA 1020
GGATGTGACG GTGACAGTAT AACAGTAAAT GGAATCAAAA TGGTGAACAA AAAGGATATT 1080

GTQACAAATA ATGGTGTGAT CCATTTGATT GATCAGGTCC TAATTCCTGA TTCTGCCAAA 1140 

CAAGTTATTG AGCTGGCTGG AAAACAGCAA ACCACCTTCA CGGATCTTGT GGCCCAATTA 1200 

GGCTTGGCAT CTGCTCTGAG GCCAGATGGA GAATACACTT TGCTGGCACC TGTGAATAAT 1260 

GCATTTTCTG ATGATACTCT CAGCATGGTT CAGCGCCTCC TTAAATTAAT TCTGCAGAAT 1320 

CACATATTGA AAGTAAAAGT TGGCCTTAAT GAGCTTTACA ACGGGCAAAT ACTGGAAACC 1380 

ATCGGAGGCA AACAGCTCAG AGTCTTCGTA TATCGTACAG CTGTCTGCAT TGAAAATTCA 1440 

TGCATGGAGA AAGGGAGTAA GCAAGGGAGA AACGGTGCGA TTCACATATT CCGCGAGATC 1500 

ATCAAGCCAG CAGAGAAATC CCTCCATGAA AAGTTAAAAC AAGATAAGCG CTTTAGCACC 1560 

TTCCTCAGCC TACTTGAAGC TGCAGACTTG AAAGAGCTCC TGACACAACC TGGAGACTGG 1620 

ACATTATTTG TGCCAACCAA TGATGCTTTT AAGGGAATGA CTAGTGAAGA AAAAGAAATT 1680 

CTGATACGGG ACAAAAATGC TCTTCAAAAC ATCATTCTTT ATCACCTGAC ACCAGGAGTT 1740 

TTCATTGGAA AAGGATTTGA ACCTGGTGTT ACTAACATTT TAAAGACCAC ACAAGGAAGC 1800 

AAAATCTTTC TGAAAGAAGT AAATGATACA CTTCTGGTGA ATGAATTGAA ATCAAAAGAA 1860 

TCTGACATCA TGACAACAAA TGGTGTAATT CATGTTGTAG ATAAACTCCT CTATCCAGCA 1920 

GACACACCTG TTGGAAATGA TCAACTGCTG GAAATACTTA ATAAATTAAT CAAATACATC 1980 

CAAATTAAGT TTGTTCGTGG TAGCACCTTC AAAGAAATCC CCGTGACTGT CTATACAACT 2040 

AAAATTATAA CCAAAGTTGT GGAACCAAAA ATTAAAGTGA TTGAAGGCAG TCTTCAGCCT 2100 

ATTATCAAAA CTGAAGGACC CACACTAACA AAAGTCAAAA TTGAAGGTGA ACCTGAATTC 2160 

AGACTGATTA AAGAAGGTGA AACAATAACT GAAGTGATCC ATGGAGAGCC AATTATTAAA 2220 

AAATACACCA AAATCATTGA TGGAGTGCCT GTGGAAATAA CTGAAAAAGA GACACGAGAA 2280 

GAACGAATCA TTACAGGTCC TGAAATAAAA TACACTAGGA TTTCTACTGG AGGTGGAGAA 2340 

ACAGAAGAAA CTCTGAAGAA ATTGTTACAA GAAGAGGTCA CCAAGGTCAC CAAATTCATT 2400 

GAAGGTGGTG ATGGTCATTT ATTTGAAGAT GAAGAAATTA AAAGACTGCT TCAGGGAGAC 2460 

ACACCCGTGA GGAAGTTGCA AGCCAACAAA AAAGTTCAAG GTTCTAGAAG ACGATTAAGG 2520 

GAAGGTCGTT CTCAGTGAAA ATCCAAAAAC CAGAAAAAAA TGTTTATACA ACCCTAAGTC 2580 

AATAACCTGA CCTTAGAAAA TTGTGAGAGC CAAGTTGACT TCAGGAACTG AAACATCAGC 2640 

ACAAAGAAGC AATCATCAAA TAATTCTGAA CACAAATTTA ATATTTTTTT TTCTGAATGA 2700 

GAAACATGAG GGAAATTGTG GAGTTAGCCT CCTGTGGTAA AGGAATTGAA GAAAATATAA 2760 

CACCTTACAC CCTTTTTCAT CTTGACATTA AAAGTTCTGG CTAACTTTGG AATCCATTAG 2820 

AGAAAAATCC TTGTCACCAG ATTCATTACA ATTCAAATCG AAGAGTTGTG AACTGTTATC 2880 

CCATTGAAAA GACCGAGCCT TGTATGTATG TTATGGATAC ATAAAATGCA CGCAAGCCAT 2940 

TATCTCTCCA TGGGAAGCTA AGTTATAAAA ATAGGTGCTT GGTGTACAAA ACTTTTTATA 3000 

TCAAAAGGCT TTGCACATTT CTATATGAGT GGGTTTACTG GTAAATTATG TTATTTTTTA 3060 

CAACTAATTT TGTACTCTCA GAATGTTTGT CATATGCTTC TTGCAATGCA TATTTTTTAA 3120 

TCTCAAACGT TTCAATAAAA CCATTTTTCA GATATAAAGA GAATTACTTC AAATTGAGTA 3180 
ATTCAGAAAA ACTCAAGATT TAAGTTAAAA AGTGGTTTGG ACTTGGGAA 

Seq ID NO: 401 Protein sequence
Protein Accession #: NP_006466.1

1 11 21 31 41 51 

I I I I I I

MIPFLPMFSL LLLLIVNPIN ANNHYDKILA HSRIRGRDQG PNVCALQQIL GTKKKYFSTC 60

KNWYKKSICG QKTTVLYECC PGYMRMEGMK GCPAVLPIDH VYGTLGIVGA TTTQRYSDAS 120

KLREEIEGKG SFTYFAPSNE AWDNLDSDIR RGLESNVNVE LLNALHSHMI NKRMLTKDLK 180

NGMIIPSMYN NLGLFINHYP NGVVTVNCAR IIHGNQIATN GVVHVIDRVL TQIGTSIQDF 240

IEAEDDLSSF RAAAITSDIL EALGRDGHFT LFAPTNEAFE KLPRGVLERF MGDKVASEAL 300

MKYHILNTLQ CSESIMGGAV FETLEGNTIE IGCDGDSITV NGIKMVNKKD IVTNNGVIHL 360

IDQVLIPDSA KQVIELAGKQ QTTFTDLVAQ LGLASALRPD GEYTLLAPVN NAFSDDTLSM 420

VQRLLKLILQ NHILKVKVGL NELYNGQILE TIGGKQLRVF VYRTAVCIEN SCMEKGSKQG 480

RNGAIHIFRE IIKPAEKSLH EKLKQDKRFS TFLSLLEAAD LKELLTQPGD WTLFVPTNDA 540

FKGMTSEEKE ILIRDKNALQ NIILYHLTPG VFIGKGFEPG VTNILKTTQG SKIFLKEVND 600

TLLVNELKSK ESDIMTTNGV IHVVDKLLYP ADTPVGNDQL LEILNKLIKY IQIKFVRGST 660

FKEIPVTVYT TKIITKVVEP KIKVIEGSLQ PIIKTEGPTL TKVKIEGEPE FRLIKEGETI 720

TEVIKGEPII KKYTKIIDGV PVEITEKETR EERIITGPEI KYTRISTGGG ETEETLKKLL 780
QEEVTKVTKF IEGGDGHLFE DEEIKRLLQG DTPVRKLQAN KKVQGSRRRL REGRSQ 

Seq ID NO: 402 DNA sequence 

Nucleic Acid Accession #: NM_002416 

Coding sequence: 40..417

1 11 21 31 41 51 

I I I I I I 

ATCCAATACA GGAGTGACTT GGAACTCCAT TCTATCACTA TGAAGAAAAG TGGTGTTCTT 60 

TTCCTCTTGG GCATCATCTT GCTGGTTCTG ATTGGAGTGC AAGGAACCCC AGTAGTGAGA 120 

AAGGGTCGCT GTTCCTGCAT CAGCACCAAC CAAGGGACTA TCCACCTACA ATCCTTGAAA 180 

GACCTTAAAC AATTTGCCCC AAGCCCTTCC TGCGAGAAAA TTGAAATCAT TGCTACACTG 240 

AAGAATGGAG TTCAAACATG TCTAAACCCA GATTCAGCAG ATGTGAAGGA ACTGATTAAA 300 

AAGTGGGAGA AACAGGTCAG CCAAAAGAAA AAGCAAAAGA ATGGGAAAAA ACATCAAAAA 360 

AAGAAAGTTC TGAAAGTTCG AAAATCTCAA CGTTCTCGTC AAAAGAAGAC TACATAAGAG 420 

ACCACTTCAC CAATAAGTAT TCTGTGTTAA AAATGTTCTA TTTTAATTAT ACCGCTATCA 480 

TTCCAAAGGA GGATGGCATA TAATACAAAG GCTTATTAAT TTGACTAGAA AATTTAAAAC 540 

ATTACTCTGA AATTGTAACT AAAGTTAGAA AGTTGATTTT AAGAATCCAA ACGTTAAGAA 600 

TTGTTAAAGG CTATGATTGT CTTTGTTCTT CTACCACCCA CCAGTTGAAT TTCATCATGC 660 

TTAAGGCCAT GATTTTAGCA ATACCCATGT CTACACAGAT GTTCACCCAA CCACATCCCA 720 

CTCACAACAG CTGCCTGGAA GAGCAGCCCT AGGCTTCCAC GTACTGCAGC CTCCAGAGAG 780 

TATCTGAGGC ACATGTCAGC AAGTCCTAAG CCTGTTAGCA TGCTGGTGAG CCAAGCAGTT 840 

TGAAATTGAG CTGGACCTCA CCAAGCTGCT GTGGCCATCA ACCTCTGTAT TTGAATCAGC 900 

CTACAGGCCT CACACACAAT GTGTCTGAGA GATTCATGCT GATTGTTATT GGGTATCACC 960 

ACTGGAGATC ACCAGTGTGT GGCTTTCAGA GCCTCCTTTC TGGCTTTGGA AGCCATGTGA 1020 

TTCCATCTTG CCCGCTCAGG CTGACCACTT TATTTCTTTT TGTTCCCCTT TGCTTCATTC 1080 

AAGTCAGCTC TTCTCCATCC TACCACAATG CAGTGCCTTT CTTCTCTCCA GTGCACCTGT 1140 

CATATGCTCT GATTTATCTG AGTCAACTCC TTTCTCATCT TGTCCCCAAC ACCCCACAGA 1200 

AGTGCTTTCT TCTCCCAATT CATCCTCACT CAGTCCAGCT TAGTTCAAGT CCTGCCTCTT 1260 

AAATAAACCT TTTTGGACAC ACAAATTATC TTAAAACTCC TGTTTCACTT GGTTCAGTAC 1320 

CACATGGGTG AACACTCAAT GGTTAACTAA TTCTTGGGTG TTTATCCTAT CTCTCCAACC 1380
AGATTGTCAG CTCCTTGAGG GCAAGAGCCA CAGTATATTT CCCTGTTTCT TCCACAGTGC 1440

CTAATAATAC TGTGGAACTA GGTTTTAATA ATTTTTTAAT TGATGTTGTT ATGGGCAGGA 1500 

TGGCAACCAG ACCATTGTCT CAGAGCAGGT GCTGGCTCTT TCCTGGCTAC TCCATGTTGG 1560 

CTAGCCTCTG GTAACCTCTT ACTTATTATC TTCAGGACAC TCACTACAGG GACCAGGGAT 1620 

GATGCAACAT CCTTGTCTTT TTATGACAGG ATGTTTGCTC AGCTTCTCCA ACAATAAGAA 1680

GCACGTGGTA AAACACTTGC GGATATTCTG GACTGTTTTT AAAAAATATA CAGTTTACCG 1740 

AAAATCATAT AATCTTACAA TGAAAAGGAC TTTATAGATC AGCCAGTGAC CAACCTTTTC 1800 

CCAACCATAC AAAAATTCCT TTTCCCGAAG GAAAAGGGCT TTCTCAATAA GCCTCAGCTT 1860 

TCTAAGATCT AACAAGATAG CCACCGAGAT CCTTATCGAA ACTCATTTTA GGCAAATATG 1920 

AGTTTTATTG TCCGTTTACT TGTTTCAGAG TTTGTATTGT GATTATCAAT TACCACACCA 1980 

TCTCCCATGA AGAAAGGGAA CGGTGAAGTA CTAAGCGCTA GAGGAAGCAG CCAAGTCGGT 2040 

TAGTGGAAGC ATGATTGGTG CCCAGTTAGC CTCTGCAGGA TGTGGAAACC TCCTTCCAGG 2100 

GGAGGTTCAG TGAATTGTGT AGGAGAGGTT GTCTGTGGCC AGAATTTAAA CCTATACTCA 2160 

CTTTCCCAAA TTGAATGACT GCTCACACTG CTGATGATTT AGAGTGCTGT CCGGTGGAGA 2220

TCCCACCCGA ACGTCTTATC TAATCATGAA ACTCCCTAGT TCCTTCATGT AACTTCCCTG 2280 

AAAAATCTAA GTGTTTCATA AATTTGAGAG TCTGTGACCC ACTTACCTTG CATCTCACAG 2340 

GTAGACAGTA TATAACTAAC AACCAAAGAC TACATATTGT CACTGACACA CACGTTATAA 2400 

TCATTTATCA TATATATACA TACATGCATA CACTCTCAAA GCAAATAATT TTTCACTTCA 2460 

AAACAGTATT GACTTGTATA CCTTGTAATT TGAAATATTT TCTTTGTTAA AATAGAATGG 2520 
TATCAATAAA TAGACCATTA ATCAG 



Seq ID NO: 403 Protein sequence 
Protein Accession #: NP_002407 

1 11 21 31 41 51 

I I I I I I

MKKSGVLFLL GIILLVLIGV QGTPVVRKGR CSCISTNQGT IHLQSLKDLK QFAPSPSCEK 60
IEIIATLKNG VQTCLNPDSA DVKELIKKWE KQVSQKKKQK NGKKHQKKKV LKVRKSQRSR 120
QKKTT 
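
By way of illustration only, the relationship between a listed coding sequence range and the corresponding protein sequence can be checked with a short script. The sketch below assumes the standard genetic code and 1-based inclusive coordinates, as in "Coding sequence: 40.. 417" for Seq ID NO: 402; the names `clean_rows` and `translate_cds` are hypothetical helpers and are not part of the listing.

```python
# Minimal sketch: translate the coding range of a listed DNA sequence.
# Assumptions: standard genetic code; coordinates are 1-based and inclusive.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean_rows(rows):
    """Join listing rows, keeping only A/C/G/T (drops spaces and position counters)."""
    return "".join(ch for row in rows for ch in row.upper() if ch in "ACGT")


def translate_cds(dna, start, end):
    """Translate positions start..end (1-based, inclusive), stopping at the first stop codon."""
    cds = dna[start - 1:end]
    protein = []
    for i in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First row of Seq ID NO: 402; the coding sequence starts at position 40.
first_row = "ATCCAATACA GGAGTGACTT GGAACTCCAT TCTATCACTA TGAAGAAAAG TGGTGTTCTT 60"
dna = clean_rows([first_row])
print(translate_cds(dna, 40, len(dna)))  # MKKSGVL
```

The printed residues agree with the first seven residues of Seq ID NO: 403. Passing all rows of a sequence to `clean_rows` and the full listed range to `translate_cds` would extend the same check to the whole protein.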



Seq ID NO: 404 DNA sequence

Nucleic Acid Accession #: NM_006670

Coding sequence: 85..1347

1 11 21 31 41 51

I I I I I I

CCGGCTCGCG CCCTCCGGGC CCAGCCTCCC GAGCCTTCGG AGCGGGCGCC GTCCCAGCCC 60
AGCTCCGGGG AAACGCGAGC CGCGATGCCT GGGGGGTGCT CCCGGGGCCC CGCCGCCGGG 120
GACGGGCGTC TGCGGCTGGC GCGACTAGCG CTGGTACTCC TGGGCTGGGT CTCCTCGTCT 180
TCTCCCACCT CCTCGGCATC CTCCTTCTCC TCCTCGGCGC CGTTCCTGGC TTCCGCCGTG 240
TCCGCCCAGC CCCCGCTGCC GGACCAGTGC CCCGCGCTGT GCGAGTGCTC CGAGGCAGCG 300
CGCACAGTCA AGTGCGTTAA CCGCAATCTG ACCGAGGTGC CCACGGACCT GCCCGCCTAC 360
GTGCGCAACC TCTTCCTTAC CGGCAACCAG CTGGCCGTGC TCCCTGCCGG CGCCTTCGCC 420
CGCCGGCCGC CGCTGGCGGA GCTGGCCGCG CTCAACCTCA GCGGCAGCCG CCTGGACGAG 480
GTGCGCGCGG GCGCCTTCGA GCATCTGCCC AGCCTGCGCC AGCTCGACCT CAGCCACAAC 540
CCACTGGCCG ACCTCAGTCC CTTCGCTTTC TCGGGCAGCA ATGCCAGCGT CTCGGCCCCC 600
AGTCCCCTTG TGGAACTGAT CCTGAACCAC ATCGTGCCCC CTGAAGATGA GCGGCAGAAC 660
CGGAGCTTCG AGGGCATGGT GGTGGCGGCC CTGCTGGCGG GCCGTGCACT GCAGGGGCTC 720
CGCCGCTTGG AGCTGGCCAG CAACCACTTC CTTTACCTGC CGCGGGATGT GCTGGCCCAA 780
CTGCCCAGCC TCAGGCACCT GGACTTAAGT AATAATTCGC TGGTGAGCCT GACCTACGTG 840
TCCTTCCGCA ACCTGACACA TCTAGAAAGC CTCCACCTGG AGGACAATGC CCTCAAGGTC 900
CTTCACAATG GCACCCTGGC TGAGTTGCAA GGTCTACCCC ACATTAGGGT TTTCCTGGAC 960
AACAATCCCT GGGTCTGCGA CTGCCACATG GCAGACATGG TGACCTGGCT CAAGGAAACA 1020
GAGGTAGTGC AGGGCAAAGA CCGGCTCACC TGTGCATATC CGGAAAAAAT GAGGAATCGG 1080
GTCCTCTTGG AACTCAACAG TGCTGACCTG GACTGTGACC CGATTCTTCC CCCATCCCTG 1140
CAAACCTCTT ATGTCTTCCT GGGTATTGTT TTAGCCCTGA TAGGCGCTAT TTTCCTCCTG 1200
GTTTTGTATT TGAACCGCAA GGGGATAAAA AAGTGGATGC ATAACATCAG AGATGCCTGC 1260
AGGGATCACA TGGAAGGGTA TCATTACAGA TATGAAATCA ATGCGGACCC CAGATTAACA 1320
AACCTCAGTT CTAACTCGGA TGTCTGAGAA ATATTAGAGG ACAGACCAAG GACAACTCTG 1380
CATGAGATGT AGACTTAAGC TTTATCCCTA CTAGGCTTGC TCCACTTTCA TCCTCCACTA 1440
TAGATACAAC GGACTTTGAC TAAAAGCAGT GAAGGGGATT TGCTTCCTTG TTATGTAAAG 1500
TTTCTCGGTG TGTTCTGTTA ATGTAAGACG ATGAACAGTT GTGTATAGTG TTTTACCCTC 1560
TTCTTTTTCT TGGAACTCCT CAACACGTAT GGAGGGATTT TTCAGGTTTC AGCATGAACA 1620
TGGGCTTCTT GCTGTCTGTC TCTCTCTCAG TACAGTTCAA GGTGTAGCAA GTGTACCCAC 1680
ACAGATAGCA TTCAACAAAA GCTGCCTCAA CTTTTTCGAG AAAAATACTT TATTCATAAA 1740
TATCAGTTTT ATTCTCATGT ACCTAAGTTG TGGAGAAAAT AATTGCATCC TATAAACTGC 1800
CTGCAGACGT TAGCAGGCTC TTCAAAATAA CTCCATGGTG CACAGGAGCA CCTGCATCCA 1860
AGAGCATGCT TACATTTTAC TGTTCTGCAT ATTACAAAAA ATAACTTGCA ACTTCATAAC 1920
TTCTTTGACA AAGTAAATTA CTTTTTTGAT TGCAGTTTAT ATGAAAATGT ACTGATTTTT 1980
TTTTAATAAA CTGCATCGAG ATCCAACCGA CTGAATTGTT AAAAAAAAAA AAAAATAAAG 2040
ATTCTTAAAA GAA



Seq ID NO: 405 Protein sequence 
Protein Accession #: NP_006661

1 11 21 31 41 51

I I I I I I

MPGGCSRGPA AGDGRLRLAR LALVLLGWVS SSSPTSSASS FSSSAPFLAS AVSAQPPLPD 60
QCPALCECSE AARTVKCVNR NLTEVPTDLP AYVRNLFLTG NQLAVLPAGA FARRPPLAEL 120
AALNLSGSRL DEVRAGAFEH LPSLRQLDLS HNPLADLSPF AFSGSNASVS APSPLVELIL 180
NHIVPPEDER QNRSFEGMVV AALLAGRALQ GLRRLELASN HFLYLPRDVL AQLPSLRHLD 240
LSNNSLVSLT YVSFRNLTHL ESLHLEDNAL KVLHNGTLAE LQGLPHIRVF LDNNPWVCDC 300
HMADMVTWLK ETEVVQGKDR LTCAYPEKMR NRVLLELNSA DLDCDPILPP SLQTSYVFLG 360
IVLALIGAIF LLVLYLNRKG IKKWMHNIRD ACRDHMEGYH YRYEINADPR LTNLSSNSDV 



Seq ID NO: 406 DNA sequence 

Nucleic Acid Accession #: Eos sequence
Coding sequence: 1..927

1 11 21 31 41 51 

I I I I I I 

ATGCCTGGGG GGTGCTCCCG GGGCCCCGCC GCCGGGGACG GGCGTCTGCG GCTGGCGCGA 60

CTAGCGCTGG TACTCCTGGG CTGGGTCTCC TCGTCTTCTC CCACCTCCTC GGCATCCTCC 120

TTCTCCTCCT CGGCGCCGTT CCTGGCTTCC GCCGTGTCCG CCCAGCCCCC GCTGCCGGAC 180

CAGTGCCCCG CGCTGTGCGA GTGCTCCGAG GCAGCGCGCA CAGTCAAGTG CGTTAACCGC 240

AATCTGACCG AGGTGCCCAC GGACCTGCCC GCCTACGTGC GCAACCTCTT CCTTACCGGC 300 

AACCAGCTGG CCAGCAACCA CTTCCTTTAC CTGCCGCGGG ATGTGCTGGC CCAACTGCCC 360 

AGCCTCAGGC ACCTGGACTT AAGTAATAAT TCGCTGGTGA GCCTGACCTA CGTGTCCTTC 420 

CGCAACCTGA CACATCTAGA AAGCCTCCAC CTGGAGGACA ATGCCCTCAA GGTCCTTCAC 480 

AATGGCACCC TGGCTGAGTT GCAAGGTCTA CCCCACATTA GGGTTTTCCT GGACAACAAT 540

CCCTGGGTCT GCGACTGCCA CATGGCAGAC ATGGTGACCT GGCTCAAGGA AACAGAGGTA 600 

GTGCAGGGCA AAGACCGGCT CACCTGTGCA TATCCGGAAA AAATGAGGAA TCGGGTCCTC 660 

TTGGAACTCA ACAGTGCTGA CCTGGACTGT GACCCGATTC TTCCCCCATC CCTGCAAACC 720 

TCTTATGTCT TCCTGGGTAT TGTTTTAGCC CTGATAGGCG CTATTTTCCT CCTGGTTTTG 780 

TATTTGAACC GCAAGGGGAT AAAAAAGTGG ATGCATAACA TCAGAGATGC CTGCAGGGAT 840 

CACATGGAAG GGTATCATTA CAGATATGAA ATCAATGCGG ACCCCAGATT AACAAACCTC 900 
AGTTCTAACT CGGATGTCCT CGAGTGA 
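
As a further non-limiting illustration, the "Coding sequence" line of each DNA entry can be used to predict the length of the protein entry that follows it. The sketch below assumes that the listed range is inclusive and ends with a stop codon; `parse_cds_range` and `expected_protein_length` are hypothetical names introduced only for this example.

```python
import re


def parse_cds_range(line):
    """Return (start, end) from a 'Coding sequence: a..b' line, tolerating stray spaces."""
    match = re.search(r"(\d+)\s*\.\s*\.\s*(\d+)", line)
    if match is None:
        raise ValueError(f"no coding range found in {line!r}")
    return int(match.group(1)), int(match.group(2))


def expected_protein_length(start, end):
    """Residue count for an inclusive range whose final codon is assumed to be a stop codon."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("coding range is not a whole number of codons")
    return length // 3 - 1


start, end = parse_cds_range("Coding sequence: 1..927")
print(expected_protein_length(start, end))  # 308
```

A mismatch between the predicted length and the number of residues actually listed would suggest that characters were lost or duplicated when the listing was reproduced.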

Seq ID NO: 407 Protein sequence 
Protein Accession #: Eos sequence

1 11 21 31 41 51

I I I I I I

MPGGCSRGPA AGDGRLRLAR LALVLLGWVS SSSPTSSASS FSSSAPFLAS AVSAQPPLPD 60
QCPALCECSE AARTVKCVNR NLTEVPTDLP AYVRNLFLTG NQLASNHFLY LPRDVLAQLP 120
SLRHLDLSNN SLVSLTYVSF RNLTHLESLH LEDNALKVLH NGTLAELQGL PHIRVFLDNN 180
PWVCDCHMAD MVTWLKETEV VQGKDRLTCA YPEKMRNRVL LELNSADLDC DPILPPSLQT 240
SYVFLGIVLA LIGAIFLLVL YLNRKGIKKW MHNIRDACRD HMEGYHYRYE INADPRLTNL 300
SSNSDVLE

Seq ID NO: 408 DNA sequence
Nucleic Acid Accession #: NM_000095.1
Coding sequence: 26..2299

1 11 21 31 41 51

I I I I I I

CAGCACCCAG CTCCCCGCCA CCGCCATGGT CCCCGACACC GCCTGCGTTC TTCTGCTCAC 60
CCTGGCTGCC CTCGGCGCGT CCGGACAGGG CCAGAGCCCG TTGGGCTCAG ACCTGGGCCC 120
GCAGATGCTT CGGGAACTGC AGGAAACCAA CGCGGCGCTG CAGGACGTGC GGGACTGGCT 180
GCGGCAGCAG GTCAGGGAGA TCACGTTCCT GAAAAACACG GTGATGGAGT GTGACGCGTG 240
CGGGATGCAG CAGTCAGTAC GCACCGGCCT ACCCAGCGTG CGGCCCCTGC TCCACTGCGC 300
GCCCGGCTTC TGCTTCCCCG GCGTGGCCTG CATCCAGACG GAGAGCGGCG GCCGCTGCGG 360
CCCCTGCCCC GCGGGCTTCA CGGGCAACGG CTCGCACTGC ACCGACGTCA ACGAGTGCAA 420
CGCCCACCCC TGCTTCCCCC GAGTCCGCTG TATCAACACC AGCCCGGGGT TCCGCTGCGA 480
GGCTTGCCCG CCGGGGTACA GCGGCCCCAC CCACCAGGGC GTGGGGCTGG CTTTCGCCAA 540
GGCCAACAAG CAGGTTTGCA CGGACATCAA CGAGTGTGAG ACCGGGCAAC ATAACTGCGT 600
CCCCAACTCC GTGTGCATCA ACACCCGGGG CTCCTTCCAG TGCGGCCCGT GCCAGCCCGG 660
CTTCGTGGGC GACCAGGCGT CCGGCTGCCA GCGCGGCGCA CAGCGCTTCT GCCCCGACGG 720
CTCGCCCAGC GAGTGCCACG AGCATGCAGA CTGCGTCCTA GAGCGCGATG GCTCGCGGTC 780
GTGCGTGTGT CGCGTTGGCT GGGCCGGCAA CGGGATCCTC TGTGGTCGCG ACACTGACCT 840
AGACGGCTTC CCGGACGAGA AGCTGCGCTG CCCGGAGCCG CAGTGCCGTA AGGACAACTG 900
CGTGACTGTG CCCAACTCAG GGCAGGAGGA TGTGGACCGC GATGGCATCG GAGACGCCTG 960
CGATCCGGAT GCCGACGGGG ACGGGGTCCC CAATGAAAAG GACAACTGCC CGCTGGTGCG 1020
GAACCCAGAC CAGCGCAACA CGGACGAGGA CAAGTGGGGC GATGCGTGCG ACAACTGCCG 1080
GTCCCAGAAG AACGACGACC AAAAGGACAC AGACCAGGAC GGCCGGGGCG ATGCGTGCGA 1140
CGACGACATC GACGGCGACC GGATCCGCAA CCAGGCCGAC AACTGCCCTA GGGTACCCAA 1200
CTCAGACCAG AAGGACAGTG ATGGCGATGG TATAGGGGAT GCCTGTGACA ACTGTCCCCA 1260
GAAGAGCAAC CCGGATCAGG CGGATGTGGA CCACGACTTT GTGGGAGATG CTTGTGACAG 1320
CGATCAAGAC CAGGATGGAG ACGGACATCA GGACTCTCGG GACAACTGTC CCACGGTGCC 1380
TAACAGTGCC CAGGAGGACT CAGACCACGA TGGCCAGGGT GATGCCTGCG ACGACGACGA 1440
CGACAATGAC GGAGTCCCTG ACAGTCGGGA CAACTGCCGC CTGGTGCCTA ACCCCGGCCA 1500
GGAGGACGCG GACAGGGACG GCGTGGGCGA CGTGTGCCAG GACGACTTTG ATGCAGACAA 1560
GGTGGTAGAC AAGATCGACG TGTGTCCGGA GAACGCTGAA GTCACGCTCA CCGACTTCAG 1620
GGCCTTCCAG ACAGTCGTGC TGGACCCGGA GGGTGACGCG CAGATTGACC CCAACTGGGT 1680
GGTGCTCAAC CAGGGAAGGG AGATCGTGCA GACAATGAAC AGCGACCCAG GCCTGGCTGT 1740
GGGTTACACT GCCTTCAATG GCGTGGACTT CGAGGGCACG TTCCATGTGA ACACGGTCAC 1800
GGATGACGAC TATGCGGGCT TCATCTTTGG CTACCAGGAC AGCTCCAGCT TCTACGTGGT 1860
CATGTGGAAG CAGATGGAGC AAACGTATTG GCAGGCGAAC CCCTTCCGTG CTGTGGCCGA 1920
GCCTGGCATC CAACTCAAGG CTGTGAAGTC TTCCACAGGC CCCGGGGAAC AGCTGCGGAA 1980
CGCTCTGTGG CATACAGGAG ACACAGAGTC CCAGGTGCGG CTGCTGTGGA AGGACCCGCG 2040
AAACGTGGGT TGGAAGGACA AGAAGTCCTA TCGTTGGTTC CTGCAGCACC GGCCCCAAGT 2100
GGGCTACATC AGGGTGCGAT TCTATGAGGG CCCTGAGCTG GTGGCCGACA GCAACGTGGT 2160
CTTGGACACA ACCATGCGGG GTGGCCGCCT GGGGGTCTTC TGCTTCTCCC AGGAGAACAT 2220
CATCTGGGCC AACCTGCGTT ACCGCTGCAA TGACACCATC CCAGAGGACT ATGAGACCCA 2280
TCAGCTGCGG CAAGCCTAGG GACCAGGGTG AGGACCCGCC GGATGACAGC CACCCTCACC 2340
GCGGCTGGAT GGGGGCTCTG CACCCAGCCC AAGGGGTGGC CGTCCTGAGG GGGAAGTGAG 2400
AAGGGCTCAG AGAGGACAAA ATAAAGTGTG TGTGCAGGG

Seq ID NO: 409 Protein sequence
Protein Accession #: NP_000086.1

1 11 21 31 41 51

I I I I I I

MVPDTACVLL LTLAALGASG QGQSPLGSDL GPQMLRELQE TNAALQDVRD WLRQQVREIT 60

FLKNTVMECD ACGMQQSVRT GLPSVRPLLH CAPGFCFPGV ACIQTESGGR CGPCPAGFTG 120
NGSHCTDVNE CNAHPCFPRV RCINTSPGFR CEACPPGYSG PTHQGVGLAF AKANKQVCTD 180
INECETGQHN CVPNSVCINT RGSFQCGPCQ PGFVGDQASG CQRGAQRFCP DGSPSECHEH 240
ADCVLERDGS RSCVCRVGWA GNGILCGRDT DLDGFPDEKL RCPEPQCRKD NCVTVPNSGQ 300
EDVDRDGIGD ACDPDADGDG VPNEKDNCPL VRNPDQRNTD EDKWGDACDN CRSQKNDDQK 360
DTDQDGRGDA CDDDIDGDRI RNQADNCPRV PNSDQKDSDG DGIGDACDNC PQKSNPDQAD 420
VDHDFVGDAC DSDQDQDGDG HQDSRDNCPT VPNSAQEDSD HDGQGDACDD DDDNDGVPDS 480
RDNCRLVPNP GQEDADRDGV GDVCQDDFDA DKVVDKIDVC PENAEVTLTD FRAFQTVVLD 540
PEGDAQIDPN WVVLNQGREI VQTMNSDPGL AVGYTAFNGV DFEGTFHVNT VTDDDYAGFI 600
FGYQDSSSFY VVMWKQMEQT YWQANPFRAV AEPGIQLKAV KSSTGPGEQL RNALWHTGDT 660
ESQVRLLWKD PRNVGWKDKK SYRWFLQHRP QVGYIRVRFY EGPELVADSN VVLDTTMRGG 720
RLGVFCFSQE NIIWANLRYR CNDTIPEDYE THQLRQA

Seq ID NO: 410 DNA sequence

Nucleic Acid Accession #: NM_001565.1

Coding sequence: 67..363

1 11 21 31 41 51

I I I I I I

GAGACATTCC TCAATTGCTT AGACATATTC TGAGCCTACA GCAGAGGAAC CTCCAGTCTC 60
AGCACCATGA ATCAAACTGC GATTCTGATT TGCTGCCTTA TCTTTCTGAC TCTAAGTGGC 120
ATTCAAGGAG TACCTCTCTC TAGAACCGTA CGCTGTACCT GCATCAGCAT TAGTAATCAA 180
CCTGTTAATC CAAGGTCTTT AGAAAAACTT GAAATTATTC CTGCAAGCCA ATTTTGTCCA 240
CGTGTTGAGA TCATTGCTAC AATGAAAAAG AAGGGTGAGA AGAGATGTCT GAATCCAGAA 300
TCGAAGGCCA TCAAGAATTT ACTGAAAGCA GTTAGCAAGG AAATGTCTAA AAGATCTCCT 360
TAAAACCAGA GGGGAGCAAA ATCGATGCAG TGCTTCCAAG GATGGACCAC ACAGAGGCTG 420
CCTCTCCCAT CACTTCCCTA CATGGAGTAT ATGTCAAGCC ATAATTGTTC TTAGTTTGCA 480
GTTACACTAA AAGGTGACCA ATGATGGTCA CCAAATCAGC TGCTACTACT CCTGTAGGAA 540
GGTTAATGTT CATCATCCTA AGCTATTCAG TAATAACTCT ACCCTGGCAC TATAATGTAA 600
GCTCTACTGA GGTGCTATGT TCTTAGTGGA TGTTCTGACC CTGCTTCAAA TATTTCCCTC 660
ACCTTTCCCA TCTTCCAAGG GTACTAAGGA ATCTTTCTGC TTTGGGGTTT ATCAGAATTC 720
TCAGAATCTC AAATAACTAA AAGGTATGCA ATCAAATCTG CTTTTTAAAG AATGCTCTTT 780
ACTTCATGGA CTTCCACTGC CATCCTCCCA AGGGGCCCAA ATTCTTTCAG TGGCTACCTA 840
CATACAATTC CAAACACATA CAGGAAGGTA GAAATATCTG AAAATGTATG TGTAAGTATT 900
CTTATTTAAT GAAAGACTGT ACAAAGTATA AGTCTTAGAT GTATATATTT CCTATATTGT 960
TTTCAGTGTA CATGGAATAA CATGTAATTA AGTACTATGT ATCAATGAGT AACAGGAAAA 1020
TTTTAAAAAT ACAGATAGAT ATATGCTCTG CATGTTACAT AAGATAAATG TGCTGAATGG 1080
TTTTCAAATA AAAATGAGGT ACTCTCCTGG AAATATTAAG

Seq ID NO: 411 Protein sequence
Protein Accession #: NP_001556.1

1 11 21 31 41 51

I I I I I I

MNQTAILICC LIFLTLSGIQ GVPLSRTVRC TCISISNQPV NPRSLEKLEI IPASQFCPRV 60
EIIATMKKKG EKRCLNPESK AIKNLLKAVS KEMSKRSP

Seq ID NO: 412 DNA sequence
Nucleic Acid Accession #: XM_057014
Coding sequence: 143..874

1 11 21 31 41 51

I I I I I I

GGGAGGGAGA GAGGCGCGCG GGTGAAAGGC GCATTGATGC AGCCTGCGGC GGCCTCGGAG 60
CGCGGCGGAG CCAGACGCTG ACCACGTTCC TCTCCTCGGT CTCCTCCGCC TCCAGCTCCG 120
CGCTGCCCGG CAGCCGGGAG CCATGCGACC CCAGGGCCCC GCCGCCTCCC CGCAGCGGCT 180
CCGCGGCCTC CTGCTGCTCC TGCTGCTGCA GCTGCCCGCG CCGTCGAGCG CCTCTGAGAT 240
CCCCAAGGGG AAGCAAAAGG CGCAGCTCCG GCAGAGGGAG GTGGTGGACC TGTATAATGG 300
AATGTGCTTA CAAGGGCCAG CAGGAGTGCC TGGTCGAGAC GGGAGCCCTG GGGCCAATGG 360
CATTCCGGGT ACACCTGGGA TCCCAGGTCG GGATGGATTC AAAGGAGAAA AGGGGGAATG 420
TCTGAGGGAA AGCTTTGAGG AGTCCTGGAC ACCCAACTAC AAGCAGTGTT CATGGAGTTC 480
ATTGAATTAT GGCATAGATC TTGGGAAAAT TGCGGAGTGT ACATTTACAA AGATGCGTTC 540
AAATAGTGCT CTAAGAGTTT TGTTCAGTGG CTCACTTCGG CTAAAATGCA GAAATGCATG 600
CTGTCAGCGT TGGTATTTCA CATTCAATGG AGCTGAATGT TCAGGACCTC TTCCCATTGA 660
AGCTATAATT TATTTGGACC AAGGAAGCCC TGAAATGAAT TCAACAATTA ATATTCATCG 720
CACTTCTTCT GTGGAAGGAC TTTGTGAAGG AATTGGTGCT GGATTAGTGG ATGTTGCTAT 780
CTGGGTTGGC ACTTGTTCAG ATTACCCAAA AGGAGATGCT TCTACTGGAT GGAATTCAGT 840
TTCTCGCATC ATTATTGAAG AACTACCAAA ATAAATGCTT TAATTTTCAT TTGCTACCTC 900
TTTTTTTATT ATGCCTTGGA ATGGTTCACT TAAATGACAT TTTAAATAAG TTTATGTATA 960
CATCTGAATG AAAAGCAAAG CTAAATATGT TTACAGACCA AAGTGTGATT TCACACTGTT 1020
TTTAAATCTA GCATTATTCA TTTTGCTTCA ATCAAAAGTG GTTTCAATAT TTTTTTTAGT 1080
TGGTTAGAAT ACTTTCTTCA TAGTCACATT CTCTCAACCT ATAATTTGGA ATATTGTTGT 1140
GGTCTTTTGT TTTTTCTCTT AGTATAGCAT TTTTAAAAAA ATATAAAAGC TACCAATCTT 1200
TGTACAATTT GTAAATGTTA AGAATTTTTT TTATATCTGT TAAATAAAAA TTATTTCCAA 1260
CAACCTTAAA AAAAAAAAAA



Seq ID NO: 413 Protein sequence 
Protein Accession #: XP_057014

1 11 21 31 41 51

I I I I I I

MRPQGPAASP QRLRGLLLLL LLQLPAPSSA SEIPKGKQKA QLRQREVVDL YNGMCLQGPA 60
GVPGRDGSPG ANGIPGTPGI PGRDGFKGEK GECLRESFEE SWTPNYKQCS WSSLNYGIDL 120
GKIAECTFTK MRSNSALRVL FSGSLRLKCR NACCQRWYFT FNGAECSGPL PIEAIIYLDQ 180
GSPEMNSTIN IHRTSSVEGL CEGIGAGLVD VAIWVGTCSD YPKGDASTGW NSVSRIIIEE 240
LPK




Seq ID NO: 414 DNA sequence 
Nucleic Acid Accession #: XM_084007
Coding sequence: 138..2405

1 11 21 31 41 51

I I I I I I

CTCGTGCCGA ATTCGGCACG AGACCGCGTG TTCGCGCCTG GTAGAGATTT CTCGAAGACA 60 

CCAGTGGGCC CGTGTGGAAC CAAACCTGCG CGCGTGGCCG GGCCGTGGGA CAACGAGGCC 120

GCGGAGACGA AGGCGCAATG GCGAGGAAGT TATCTGTAAT CTTGATCCTG ACCTTTGCCC 180

TCTCTGTCAC AAATCCCCTT CATGAACTAA AAGCAGCTGC TTTCCCCCAG ACCACTGAGA 240 

AAATTAGTCC GAATTGGGAA TCTGGCATTA ATGTTGACTT GGCAATTTCC ACACGGCAAT 300 

ATCATCTACA ACAGCTTTTC TACCGCTATG GAGAAAATAA TTCTTTGTCA GTTGAAGGGT 360 

TCAGAAAATT ACTTCAAAAT ATAGGCATAG ATAAGATTAA AAGAATCCAT ATACACCATG 420 

ACCACGACCA TCACTCAGAC CACGAGCATC ACTCAGACCA TGAGCGTCAC TCAGACCATG 480 

AGCATCACTC AGACCACGAG CATCACTCTG ACCATGATCA TCACTCCCAC CATAATCATG 540 

CTGCTTCTGG TAAAAATAAG CGAAAAGCTC TTTGCCCAGA CCATGACTCA GATAGTTCAG 600 

GTAAAGATCC TAGAAACAGC CAGGGGAAAG GAGCTCACCG ACCAGAACAT GCCAGTGGTA 660 

GAAGGAATGT CAAGGACAGT GTTAGTGCTA GTGAAGTGAC CTCAACTGTG TACAACACTG 720 

TCTCTGAAGG AACTCACTTT CTAGAGACAA TAGAGACTCC AAGACCTGGA AAACTCTTCC 780 

CCAAAGATGT AAGCAGCTCC ACTCCACCCA GTGTCACATC AAAGAGCCGG GTGAGCCGGC 840 

TGGCTGGTAG GAAAACAAAT GAATCTGTGA GTGAGCCCCG AAAAGGCTTT ATGTATTCCA 900

GAAACACAAA TGAAAATCCT CAGGAGTGTT TCAATGCATC AAAGCTACTG ACATCTCATG 960

GCATGGGCAT CCAGGTTCCG CTGAATGCAA CAGAGTTCAA CTATCTCTGT CCAGCCATCA 1020 

TCAACCAAAT TGATGCTAGA TCTTGTCTGA TTCATACAAG TGAAAAGAAG GCTGAAATCC 1080 

CTCCAAAGAC CTATTCATTA CAAATAGCCT GGGTTGGTGG TTTTATAGCC ATTTCCATCA 1140 

TCAGTTTCCT GTCTCTGCTG GGGGTTATCT TAGTGCCTCT CATGAATCGG GTGTTTTTCA 1200 

AATTTCTCCT GAGTTTCCTT GTGGCACTGG CCGTTGGGAC TTTGAGTGGT GATGCTTTTT 1260 

TACACCTTCT TCCACATTCT CATGCAAGTC ACCACCATAG TCATAGCCAT GAAGAACCAG 1320

CAATGGAAAT GAAAAGAGGA CCACTTTTCA GTCATCTGTC TTCTCAAAAC ATAGAAGAAA 1380 

GTGCCTATTT TGATTCCACG TGGAAGGGTC TAACAGCTCT AGGAGGCCTG TATTTCATGT 1440 

TTCTTGTTGA ACATGTCCTC ACATTGATCA AACAATTTAA AGATAAGAAG AAAAAGAATC 1500 

AGAAGAAACC TGAAAATGAT GATGATGTGG AGATTAAGAA GCAGTTGTCC AAGTATGAAT 1560 

CTCAACTTTC AACAAATGAG GAGAAAGTAG ATACAGATGA TCGAACTGAA GGCTATTTAC 1620 

GAGCAGACTC ACAAGAGCCC TCCCACTTTG ATTCTCAGCA GCCTGCAGTC TTGGAAGAAG 1680 

AAGAGGTCAT GATAGCTCAT GCTCATCCAC AGGAAGTCTA CAATGAATAT GTACCCAGAG 1740 

GGTGCAAGAA TAAATGCCAT TCACATTTCC ACGATACACT CGGCCAGTCA GACGATCTCA 1800 

TTCACCACCA TCATGACTAC CATCATATTC TCCATCATCA CCACCACCAA AACCACCATC 1860 

CTCACAGTCA CAGCCAGCGC TACTCTCGGG AGGAGCTGAA AGATGCCGGC GTCGCCACTT 1920 

TGGCCTGGAT GGTGATAATG GGTGATGGCC TGCACAATTT CAGCGATGGC CTAGCAATTG 1980 

GTGCTGCTTT TACTGAAGGC TTATCAAGTG GTTTAAGTAC TTCTGTTGCT GTGTTCTGTC 2040 

ATGAGTTGCC TCATGAATTA GGTGACTTTG CTGTTCTACT AAAGGCTGGC ATGACCGTTA 2100 

AGCAGGCTGT CCTTTATAAT GCATTGTCAG CCATGCTGGC GTATCTTGGA ATGGCAACAG 2160 

GAATTTTCAT TGGTCATTAT GCTGAAAATG TTTCTATGTG GATATTTGCA CTTACTGCTG 2220 

GCTTATTCAT GTATGTTGCT CTGGTTGATA TGGTACCTGA AATGCTGCAC AATGATGCTA 2280 

GTGACCATGG ATGTAGCCGC TGGGGGTATT TCTTTTTACA GAATGCTGGG ATGCTTTTGG 2340 

GTTTTGGAAT TATGTTACTT ATTTCCATAT TTGAACATAA AATCGTGTTT CGTATAAATT 2400 

TCTAGTTAAG GTTTAAATGC TAGAGTAGCT TAAAAAGTTG TCATAGTTTC AGTAGGTCAT 2460 

AGGGAGATGA GTTTGTATGC TGTACTATGC AGCGTTTAAA GTTAGTGGGT TTTGTGATTT 2520 

TTGTATTGAA TATTGCTGTC TGTTACAAAG TCAGTTAAAG GTACGTTTTA ATATTTAAGT 2580 

TATTCTATCT TGGAGATAAA ATCTGTATGT GCAATTCACC GGTATTACCA GTTTATTATG 2640 

TAAACAAGAG ATTTGGCATG ACATGTTCTG TATGTTTCAG GGAAAAATGT CTTTAATGCT 2700 

TTTTCAAGAA CTAACACAGT TATTCCTATA CTGGATTTTA GGTCTCTGAA GAACTGCTGG 2760 

TGTTTAGGAA TAAGAATGTG CATGAAGCCT AAAATACCAA GAAAGCTTAT ACTGAATTTA 2820 

AGCAAAGAAA TAAAGGAGAA AAGAGAAGAA TCTGAGAATT GGGGAGGCAT AGATTCTTAT 2880 

AAAAATCACA AAATTTGTTG TAAATTAGAG GGGAGAAATT TAGAATTAAG TATAAAAAGG 2940 

CAGAATTAGT ATAGAGTACA TTCATTAAAC ATTTTTGTCA GGATTATTTC CCGTAAAAAC 3000 

GTAGTGAGCA CTCTCATATA CTAATTAGTG TACATTTAAC TTTGTATAAT ACAGAAATCT 3060 

AAATATATTT AATGAATTCA AGCAATATAC ACTTGACCAA GAAATTGGAA TTTCAAAATG 3120 

TTCGTGCGGG TTATATACCA GATGAGTACA GTGAGTAGTT TATGTATCAC CAGACTGGGT 3180 

TATTGCCAAG TTATATATCA CCAAAAGCTG TATGACTGGA TGTTCTGGTT ACCTGGTTTA 3240 

CAAAATTATC AGAGTAGTAA AACTTTGATA TATATGAGGA TATTAAAACT ACACTAAGTA 3300 

TCATTTGATT CGATTCAGAA AGTACTTTGA TATCTCTCAG TGCTTCAGTG CTATCATTGT 3360 

GAGCAATTGT CTTTATATAC GGTACTGTAG CCATACTAGG CCTGTCTGTG GCATTCTCTA 3420 
GATGTTTCTT TTTTACACAA TAAATTCCTT ATATCAGCTT G

Seq ID NO: 415 Protein sequence

MARKLSVILI LTFALSVTNP LHELKAAAFP QTTEKISPNW ESGINVDLAI STRQYHLQQL 60

FYRYGENNSL SVEGFRKLLQ NIGIDKIKRI HIHHDHDHHS DHEHHSDHER HSDHEHHSDH 120

EHHSDHDHHS HHNHAASGKN KRKALCPDHD SDSSGKDPRN SQGKGAHRPE HASGRRNVKD 180

SVSASEVTST VYNTVSEGTH FLETIETPRP GKLFPKDVSS STPPSVTSKS RVSRLAGRKT 240

NESVSEPRKG FMYSRNTNEN PQECFNASKL LTSHGMGIQV PLNATEFNYL CPAIINQIDA 300

RSCLIHTSEK KAEIPPKTYS LQIAWVGGFI AISIISFLSL LGVILVPLMN RVFFKFLLSF 360

LVALAVGTLS GDAFLHLLPH SHASHHHSHS HEEPAMEMKR GPLFSHLSSQ NIEESAYFDS 420

TWKGLTALGG LYFMFLVEHV LTLIKQFKDK KKKNQKKPEN DDDVEIKKQL SKYESQLSTN 480

EEKVDTDDRT EGYLRADSQE PSHFDSQQPA VLEEEEVMIA HAHPQEVYNE YVPRGCKNKC 540

HSHFHDTLGQ SDDLIHHHHD YHHILHHHHH QNHHPHSHSQ RYSREELKDA GVATLAWMVI 600

MGDGLHNFSD GLAIGAAFTE GLSSGLSTSV AVFCHELPHE LGDFAVLLKA GMTVKQAVLY 660

NALSAMLAYL GMATGIFIGH YAENVSMWIF ALTAGLFMYV ALVDMVPEML HNDASDHGCS 720
RWGYFFLQNA GMLLGFGIML LISIFEHKIV FRINF 
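
Solely as an illustrative aid, rows of a sequence listing that were damaged during reproduction can be located mechanically before they are corrected by hand. The sketch below assumes rows written as space-separated groups of ten characters followed by a position counter; the function name `flag_suspect_groups`, the alphabets, and the sample row are hypothetical and serve only this example.

```python
DNA_ALPHABET = set("ACGT")
PROTEIN_ALPHABET = set("ACDEFGHIKLMNPQRSTVWY")


def flag_suspect_groups(row, alphabet, group_size=10):
    """Return (group_index, group) pairs that look damaged in one listing row.

    A group is suspect if it contains a character outside the alphabet, or if
    it is not group_size long and is not the final group of the row.
    """
    groups = row.split()
    # Drop the trailing position counter when it is a clean number.
    if groups and groups[-1].isdigit():
        groups = groups[:-1]
    suspects = []
    for index, group in enumerate(groups):
        bad_char = any(ch not in alphabet for ch in group)
        bad_length = len(group) != group_size and index != len(groups) - 1
        if bad_char or bad_length:
            suspects.append((index, group))
    return suspects


# Hypothetical damaged row: a digit has replaced a base in the fourth group.
damaged_row = "ATGCCTGGGG GGTGCTCCCG GGGCCCCGCC GCCG6GGACG 40"
print(flag_suspect_groups(damaged_row, DNA_ALPHABET))  # [(3, 'GCCG6GGACG')]
```

A flagged protein group can then be compared against a translation of the coding sequence of the corresponding DNA entry, where one is listed, to decide which residue was intended.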

Seq ID NO: 416 DNA sequence
ATGCCCAAGC GCGCGCACTG GGGGGCCCTC TCCGTGGTGC TGATCCTGCT TTGGGGCCAT 60

CCGCGAGTGG CGCTGGCCTG CCCGCATCCT TGTGCCTGCT ACGTCCCCAG CGAGGTCCAC 120

TGCACGTTCC GATCCCTGGC TTCCGTGCCC GCTGGCATTG CTAGACACGT GGAAAGAATC 180 

AATTTGGGGT TTAATAGCAT ACAGGCCCTG TCAGAAACCT CATTTGCAGG ACTGACCAAG 240 

TTGGAGCTAC TTATGATTCA CGGCAATGAG ATCCCAAGCA TCCCCGATGG AGCTTTAAGA 300

GACCTCAGCT CTCTTCAGGT TTTCAAGTTC AGCTACAACA AGCTGAGAGT GATCACAGGA 360 

CAGACCCTCC AGGGTCTCTC TAACTTAATG AGGCTGCACA TTGACCACAA CAAGATCGAG 420

TTTATCCACC CTCAAGCTTT CAACGGCTTA ACGTCTCTGA GGCTACTCCA TTTGGAAGGA 480 

AATCTCCTCC ACCAGCTGCA CCCCAGCACC TTCTCCACGT TCACATTTTT GGATTATTTC 540 

AGACTCTCCA CCATAAGGCA CCTCTACTTA GCAGAGAACA TGGTTAGAAC TCTTCCTGCC 600 

AGCATGCTTC GGAACATGCC GCTTCTGGAG AATCTTTACT TGCAGGGAAA TCCGTGGACC 660 

TGCGATTGTG AGATGAGATG GTTTTTGGAA TGGGATGCAA AATCCAGAGG AATTCTGAAG 720

TGTAAAAAGG ACAAAGCTTA TGAAGGCGGT CAGTTGTGTG CAATGTGCTT CAGTCCAAAG 780 

AAGTTGTACA AACATGAGAT ACACAAGCTG AAGGACATGA CTTGTCTGAA GCCTTCAATA 840 

GAGTCCCCTC TGAGACAGAA CAGGAGCAGG AGTATTGAGG AGGAGCAAGA ACAGGAAGAG 900

GATGGTGGCA GCCAGCTCAT CCTGGAGAAA TTCCAACTGC CCCAGTGGAG CATCTCTTTG 960 

AATATGACCG ACGAGCACGG GAACATGGTG AACTTGGTCT GTGACATCAA GAAACCAATG 1020

GATGTGTACA AGATTCACTT GAACCAAACG GATCCTCCAG ATATTGACAT AAATGCAACA 1080

GTTGCCTTGG ACTTTGAGTG TCCAATGACC CCAGAAAACT ATGAAAAGCT ATGGAAATTG 1140

ATAGCATACT ACAGTGAAGT TCCCGTGAAG CTACACAGAG AGCTCATGCT CAGCAAAGAC 1200 

CCCAGAGTCA GCTACCAGTA CAGGCAGGAT GCTGATGAGG AAGCTCTTTA CTACACAGGT 1260 

GTGAGAGCCC AGATTCTTGC AGAACCAGAA TGGGTCATGC AGCCATCCAT AGATATCCAG 1320

CTGAACCGAC GTCAGAGTAC GGCCAAGAAG GTGCTACTTT CCTACTACAC CCAGTATTCT 1380

CAAACAATAT CCACCAAAGA TACAAGGCAG GCTCGGGGCA GAAGCTGGGT AATGATTGAG 1440 

CCTAGTGGAG CTGTGCAAAG AGATCAGACT GTCCTGGAAG GGGGTCCATG CCAGTTGAGC 1500

TGCAACGTGA AAGCTTCTGA GAGTCCATCT ATCTTCTGGG TGCTTCCAGA TGGCTCCATC 1560

CTGAAAGCGC CCATGGATGA CCCAGACAGC AAGTTCTCCA TTCTCAGCAG TGGCTGGCTG 1620

AGGATCAAGT CCATGGAGCC ATCTGACTCA GGCTTGTACC AGTGCATTGC TCAAGTGAGG 1680 

GATGAAATGG ACCGCATGGT ATATAGGGTA CTTGTGCAGT CTCCCTCCAC TCAGCCAGCC 1740 

GAGAAAGACA CAGTGACAAT TGGCAAGAAC CCAGGGGAGT CGGTGACATT GCCTTGCAAT 1800 

GCTTTAGCAA TACCCGAAGC CCACCTTAGC TGGATTCTTC CAAACAGAAG GATAATTAAT 1860 

GATTTGGCTA ACACATCACA TGTATACATG TTGCCAAATG GAACTCTTTC CATCCCAAAG 1920

GTCCAAGTCA GTGATAGTGG TTACTACAGA TGTGTGGCTG TCAACCAGCA AGGGGCAGAC 1980 

CATTTTACGG TGGGAATCAC AGTGACCAAG AAAGGGTCTG GCTTGCCATC CAAAAGAGGC 2040 

AGACGCCCAG GTGCAAAGGC TCTTTCCAGA GTCAGAGAAG ACATCGTGGA GGATGAAGGG 2100 

GGCTCGGGCA TGGGAGATGA AGAGAACACT TCAAGGAGAC TTCTGCATCC AAAGGACCAA 2160 

GAGGTGTTCC TCAAAACAAA GGATGATGCC ATCAATGGAG ACAAGAAAGC CAAGAAAGGG 2220

AGAAGAAAGC TGAAACTCTG GAAGCATTCG GAAAAAGAAC CAGAGACCAA TGTTGCAGAA 2280 

GGTCGCAGAG TGTTTGAATC TAGACGAAGG ATAAACATGG CAAACAAACA GATTAATCCG 2340 

GAGCGCTGGG CTGATATTTT AGCCAAAGTC CGTGGGAAAA ATCTCCCTAA GGGCACAGAA 2400 

GTACCCCCGT TGATTAAAAC CACAAGTCCT CCATCCTTGA GCCTAGAAGT CACACCACCT 2460 

TTTCCTGCTG TTTCTCCCCC CTCAGCATCT CCTGTGCAGA CAGTAACCAG TGCTGAAGAA 2520

TCCTCAGCAG ATGTACCTCT ACTTGGTGAA GAAGAGCACG TTTTGGGTAC CATTTCCTCA 2580 

GCCAGCATGG GGCTAGAACA CAACCACAAT GGAGTTATTC TTGTTGAACC TGAAGTAACA 2640 

AGCACACCTC TGGAGGAAGT TGTTGATGAC CTTTCTGAGA AGACTGAGGA GATAACTTCC 2700 

ACTGAAGGAG ACCTGAAGGG GACAGCAGCC CCTACACTTA TATCTGAGCC TTATGAACCA 2760 

TCTCCTACTC TGCACACATT AGACACAGTC TATGAAAAGC CCACCCATGA AGAGACGGCA 2820

ACAGAGGGTT GGTCTGCAGC AGATGTTGGA TCGTCACCAG AGCCCACATC CAGTGAGTAT 2880 

GAGCCTCCAT TGGATGCTGT CTCCTTGGCT GAGTCTGAGC CCATGCAATA CTTTGACCCA 2940 

GATTTGGAGA CTAAGTCACA ACCAGATGAG GATAAGATGA AAGAAGACAC CTTTGCACAC 3000 

CTTACTCCAA CCCCCACCAT CTGGGTTAAT GACTCCAGTA CATCACAGTT ATTTGAGGAT 3060 

TCTACTATAG GGGAACCAGG TGTCCCAGGC CAATCACATC TACAAGGACT GACAGACAAC 3120

ATCCACCTTG TGAAAAGTAG TCTAAGCACT CAAGACACCT TACTGATTAA AAAGGGTATG 3180 

AAAGAGATGT CTCAGACACT ACAGGGAGGA AATATGCTAG AGGGAGACCC CACACACTCC 3240 

AGAAGTTCTG AGAGTGAGGG CCAAGAGAGC AAATCCATCA CTTTGCCTGA CTCCACACTG 3300 

GGTATAATGA GCAGTATGTC TCCAGTTAAG AAGCCTGCGG AAACCACAGT TGGTACCCTC 3360 

CTAGACAAAG ACACCACAAC AGTAACAACA ACACCAAGGC AAAAAGTTGC TCCGTCATCC 3420

ACCATGAGCA CTCACCCTTC TCGAAGGAGA CCCAACGGGA GAAGGAGATT ACGCCCCAAC 3480 

AAATTCCGCC ACCGGCACAA GCAAACCCCA CCCACAACTT TTGCCCCATC AGAGACTTTT 3540 

TCTACTCAAC CAACTCAAGC ACCTGACATT AAGATTTCAA GTCAAGTGGA GAGTTCTCTG 3600 

GTTCCTACAG CTTGGGTGGA TAACACAGTT AATACCCCCA AACAGTTGGA AATGGAGAAG 3660 

AATGCAGAAC CCACATCCAA GGGAACACCA CGGAGAAAAC ACGGGAAGAG GCCAAACAAA 3720

CATCGATATA CCCCTTCTAC AGTGAGCTCA AGAGCGTCCG GATCCAAGCC CAGCCCTTCT 3780 

CCAGAAAATA AACATAGAAA CATTGTTACT CCCAGTTCAG AAACTATACT TTTGCCTAGA 3840 

ACTGTTTCTC TGAAAACTGA GGGCCCTTAT GATTCCTTAG ATTACATGAC AACCACCAGA 3900 

AAAATATATT CATCTTACCC TAAAGTCCAA GAGACACTTC CAGTCACATA TAAACCCACA 3960 

TCAGATGGAA AAGAAATTAA GGATGATGTT GCCACAAATG TTGACAAACA TAAAAGTGAC 4020

ATTTTAGTCA CTGGTGAATC AATTACTAAT GCCATACCAA CTTCTCGCTC CTTGGTCTCC 4080 

ACTATGGGAG AATTTAAGGA AGAATCCTCT CCTGTAGGCT TTCCAGGAAC TCCAACCTGG 4140 

AATCCCTCAA GGACGGCCCA GCCTGGGAGG CTACAGACAG ACATACCTGT TACCACTTCT 4200 

GGGGAAAATC TTACAGACCC TCCCCTTCTT AAAGAGCTTG AGGATGTGGA TTTCACTTCC 4260 

GAGTTTTTGT CCTCTTTGAC AGTCTCCACA CCATTTCACC AGGAAGAAGC TGGTTCTTCC 4320

ACAACTCTCT CAAGCATAAA AGTGGAGGTG GCTTCAAGTC AGGCAGAAAC CACCACCCTT 4380 

GATCAAGATC ATCTTGAAAC CACTGTGGCT ATTCTCCTTT CTGAAACTAG ACCACAGAAT 4440 

CACACCCCTA CTGCTGCCCG GATGAAGGAG CCAGCATCCT CGTCCCCATC CACAATTCTC 4500 

ATGTCTTTGG GACAAACCAC CACCACTAAG CCAGCACTTC CCAGTCCAAG AATATCTCAA 4560 

GCATCTAGAG ATTCCAAGGA AAATGTTTTC TTGAATTATG TGGGGAATCC AGAAACAGAA 4620

GCAACCCCAG TCAACAATGA AGGAACACAG CATATGTCAG GGCCAAATGA ATTATCAACA 4680 

CCCTCTTCCG ACCGGGATGC ATTTAACTTG TCTACAAAGC TGGAATTGGA AAAGCAAGTA 4740 

TTTGGTAGTA GGAGTCTACC ACGTGGCCCA GATAGCCAAC GCCAGGATGG AAGAGTTCAT 4800 

GCTTCTCATC AACTAACCAG AGTCCCTGCC AAACCCATCC TACCAACAGC AACAGTGAGG 4860 

CTACCTGAAA TGTCCACACA AAGCGCTTCC AGATACTTTG TAACTTCCCA GTCACCTCGT 4920

CACTGGACCA ACAAACCGGA AATAACTACA TATCCTTCTG GGGCTTTGCC AGAGAACAAA 4980 

CAGTTTACAA CTCCAAGATT ATCAAGTACA ACAATTCCTC TCCCATTGCA CATGTCCAAA 5040 




CCCAGCATTC CTAGTAAGTT TACTGACCGA AGAACTGACC AATTCAATGG TTACTCCAAA 5100 

GTGTTTGGAA ATAACAACAT CCCTGAGGCA AGAAACCCAG TTGGAAAGCC TCCCAGTCCA 5160

AGAATTCCTC ATTATTCCAA TGGAAGACTC CCTTTCTTTA CCAACAAGAC TCTTTCTTTT 5220 

CCACAGTTGG GAGTCACCCG GAGACCCCAG ATACCCACTT CTCCTGCCCC AGTAATGAGA 5280 

GAGAGAAAAG TTATTCCAGG TTCCTACAAC AGGATACATT CCCATAGCAC CTTCCATCTG 5340 

GACTTTGGCC CTCCGGCACC TCCGTTGTTG CACACTCCGC AGACCACGGG ATCACCCTCA 5400

ACTAACTTAC AGAATATCCC TATGGTCTCT TCCACCCAGA GTTCTATCTC CTTTATAACA 5460 

TCTTCTGTCC AGTCCTCAGG AAGCTTCCAC CAGAGCAGCT CAAAGTTCTT TGCAGGAGGA 5520 

CCTCCTGCAT CCAAATTCTG GTCTCTTGGG GAAAAGCCCC AAATCCTCAC CAAGTCCCCA 5580 

CAGACTGTGT CCGTCACCGC TGAGACAGAC ACTGTGTTCC CCTGTGAGGC AACAGGAAAA 5640

CCAAAGCCTT TCGTTACTTG GACAAAGGTT TCCACAGGAG CTCTTATGAC TCCGAATACC 5700 

AGGATACAAC GGTTTGAGGT TCTCAAGAAC GGTACCTTAG TGATACGGAA GGTTCAAGTA 5760 

CAAGATCGAG GCCAGTATAT GTGCACCGCC AGCAACCTGC ACGGCCTGGA CAGGATGGTG 5820 

GTCTTGCTTT CGGTCACCGT GCAGCAACCT CAAATCCTAG CCTCCCACTA CCAGGACGTC 5880 

ACTGTCTACC TGGGAGACAC CATTGCAATG GAGTGTCTGG CCAAAGGGAC CCCAGCCCCC 5940 

CAAATTTCCT GGATCTTCCC TGACAGGAGG GTGTGGCAAA CTGTGTCCCC CGTGGAGAGC 6000 

CGCATCACCC TGCACGAAAA CCGGACCCTT TCCATCAAGG AGGCGTCCTT CTCAGACAGA 6060 

GGCGTCTATA AGTGCGTGGC CAGCAATGCA GCCGGGGCGG ACAGCCTGGC CATCCGCCTG 6120 

CACGTGGCGG CACTGCCCCC CGTTATCCAC CAGGAGAAGC TGGAGAACAT CTCGCTGCCC 6180 

CCGGGGCTCA GCATTCACAT TCACTGCACT GCCAAGGCTG CGCCCCTGCC CAGCGTGCGC 6240 

TGGGTGCTCG GGGACGGTAC CCAGATCCGC CCCTCGCAGT TCCTCCACGG GAACTTGTTT 6300

GTTTTCCCCA ACGGGACGCT CTACATCCGC AACCTCGCGC CCAAGGACAG CGGGCGCTAT 6360 

GAGTGCGTGG CCGCCAACCT GGTAGGCTCC GCGCGCAGGA CGGTGCAGCT GAACGTGCAG 6420 

CGTGCAGCAG CCAACGCGCG CATCACGGGC ACCTCCCCGC GGAGGACGGA CGTCAGGTAC 6480 

GGAGGAACCC TCAAGCTGGA CTGCAGCGCC TCGGGGGACC CCTGGCCGCG CATCCTCTGG 6540

AGGCTGCCGT CCAAGAGGAT GATCGACGCG CTCTTCAGTT TTGATAGCAG AATCAAGGTG 6600

TTTGCCAATG GGACCCTGGT GGTGAAATCA GTGACGGACA AAGATGCCGG AGATTACCTG 6660 

TGCGTAGCTC GAAATAAGGT TGGTGATGAC TACGTGGTGC TCAAAGTGGA TGTGGTGATG 6720 

AAACCGGCCA AGATTGAACA CAAGGAGGAG AACGACCACA AAGTCTTCTA CGGGGGTGAC 6780 

CTGAAAGTGG ACTGTGTGGC CACCGGGCTT CCCAATCCCG AGATCTCCTG GAGCCTCCCA 6840 

GACGGGAGTC TGGTGAACTC CTTCATGCAG TCGGATGACA GCGGTGGACG CACCAAGCGC 6900 

TATGTCGTCT TCAACAATGG GACACTCTAC TTTAACGAAG TGGGGATGAG GGAGGAAGGA 6960

GACTACACCT GCTTTGCTGA AAATCAGGTC GGGAAGGACG AGATGAGAGT CAGAGTCAAG 7020 

GTGGTGACAG CGCCCGCCAC CATCCGGAAC AAGACTTACT TGGCGGTTCA GGTGCCCTAT 7080

GGAGACGTGG TCACTGTAGC CTGTGAGGCC AAAGGAGAAC CCATGCCCAA GGTGACTTGG 7140 

TTGTCCCCAA CCAACAAGGT GATCCCCACC TCCTCTGAGA AGTATCAGAT ATACCAAGAT 7200 

GGCACTCTCC TTATTCAGAA AGCCCAGCGT TCTGACAGCG GCAACTACAC CTGCCTGGTC 7260 

AGGAACAGCG CGGGAGAGGA TAGGAAGACG GTGTGGATTC ACGTCAACGT CCAGCCACCC 7320 

AAGATCAACG GTAACCCCAA CCCCATCACC ACCGTGCGGG AGATAGCAGC CGGGGGCAGT 7380 

CGGAAACTGA TTGACTGCAA AGCTGAAGGC ATCCCCACCC CGAGGGTGTT ATGGGCTTTT 7440 

CCCGAGGGTG TGGTTCTGCC AGCTCCATAC TATGGAAACC GGATCACTGT CCATGGCAAC 7500 

GGTTCCCTGG ACATCAGGAG TTTGAGGAAG AGCGACTCCG TCCAGCTGGT ATGCATGGCA 7560 

CGCAACGAGG GAGGGGAGGC GAGGTTGATC GTGCAGCTCA CTGTCCTGGA GCCCATGGAG 7620 

AAACCCATCT TCCACGACCC GATCAGCGAG AAGATCACGG CCATGGCGGG CCACACCATC 7680 

AGCCTCAACT GCTCTGCCGC GGGGACCCCG ACACCCAGCC TGGTGTGGGT CCTTCCCAAT 7740 

GGCACCGATC TGCAGAGTGG ACAGCAGCTG CAGCGCTTCT ACCACAAGGC TGACGGCATG 7800 

CTACACATTA GCGGTCTCTC CTCGGTGGAC GCTGGGGCCT ACCGCTGCGT GGCCCGCAAT 7860 

GCCGCTGGCC ACACGGAGAG GCTGGTCTCC CTGAAGGTGG GACTGAAGCC AGAAGCAAAC 7920 

AAGCAGTATC ATAACCTGGT CAGCATCATC AATGGTGAGA CCCTGAAGCT CCCCTGCACC 7980 

CCTCCCGGGG CTGGGCAGGG ACGTTTCTCC TGGACGCTCC CCAATGGCAT GCATCTGGAG 8040 

GGCCCCCAAA CCCTGGGACG CGTTTCTCTT CTGGACAATG GCACCCTCAC GGTTCGTGAG 8100

GCCTCGGTGT TTGACAGGGG TACCTATGTA TGCAGGATGG AGACGGAGTA CGGCCCTTCG 8160 

GTCACCAGCA TCCCCGTGAT TGTGATCGCC TATCCTCCCC GGATCACCAG CGAGCCCACC 8220 

CCGGTCATCT ACACCCGGCC CGGGAACACC GTGAAACTGA ACTGCATGGC TATGGGGATT 8280 

CCCAAAGCTG ACATCACGTG GGAGTTACCG GATAAGTCGC ATCTGAAGGC AGGGGTTCAG 8340 

GCTCGTCTGT ATGGAAACAG ATTTCTTCAC CCCCAGGGAT CACTGACCAT CCAGCATGCC 8400 

ACACAGAGAG ATGCCGGCTT CTACAAGTGC ATGGCAAAAA ACATTCTCGG CAGTGACTCC 8460 

AAAACAACTT ACATCCACGT CTTCTGAAAT GTGGATTCCA GAATGATTGC TTAGGAACTG 8520 

ACAACAAAGC GGGGTTTGTA AGGGAAGCCA GGTTGGGGAA TAGGAGCTCT TAAATAATGT 8580 

GTCACAGTGC ATGGTGGCCT CTGGTGGGTT TCAAGTTGAG GTTGATCTTG ATCTACAATT 8640 

GTTGGGAAAA GGAAGCAATG CAGACACGAG AAGGAGGGCT CAGCCTTGCT GAGACACTTT 8700 

CTTTTGTGTT TACATCATGC CAGGGGCTTC ATTCAGGGTG TCTGTGCTCT GACTGCAATT 8760 

TTTCTTCTTT TGCAAATGCC ACTCGACTGC CTTCATAAGC GTCCATAGGA TATCTGAGGA 8820 

ACATTCATCA AAAATAAGCC ATAGACATGA ACAACACCTC ACTACCCCAT TGAAGACGCA 8880 

TCACCTAGTT AACCTGCTGC AGTTTTTACA TGATAGACTT TGTTCCAGAT TGACAAGTCA 8940 

TCTTTCAGTT ATTTCCTCTG TCACTTCAAA ACTCCAGCTT GCCCAATAAG GATTTAGAAC 9000 

CAGAGTGACT GATATATATA TATATATTTT AATTCAGAGT TACATACATA CAGCTACCAT 9060 

TTTATATGAA AAAAGAAAAA CATTTCTTCC TGGAACTCAC TTTTTATATA ATGTTTTATA 9120 

TATATATTTT TTCCTTTCAA ATCAGACGAT GAGACTAGAA GGAGAAATAC TTTCTGTCTT 9180 

ATTAAAATTA ATAAATTATT GGTCTTTACA AGACTTGGAT ACATTACAGC AGACATGGAA 9240

ATATAATTTT AAAAAATTTC TCTCCAACCT CCTTCAAATT CAGTCACCAC TGTTATATTA 9300

CCTTCTCCAG GAACCCTCCA GTGGGGAAGG CTGCGATATT AGATTTCCTT GTATGCAAAG 9360 

TTTTTGTTGA AAGCTGTGCT CAGAGGAGGT GAGAGGAGAG GAAGGAGAAA ACTGCATCAT 9420 

AACTTTACAG AATTGAATCT AGAGTCTTCC CCGAAAAGCC CAGAAACTTC TCTGCAGTAT 9480 

CTGGCTTGTC CATCTGGTCT AAGGTGGCTG CTTCTTCCCC AGCCATGAGT CAGTTTGTGC 9540 

CCATGAATAA TACACGACCT GTTATTTCCA TGACTGCTTT ACTGTATTTT TAAGGTCAAT 9600 
ATACTGTACA TTTGATAATA AAATAATATT CTCCCAAAAA AAAAA 
MPKRAHWGAL SWLILLWGH PRVALACPHP CACYVPSEVH CTFRSLASVP AGIARHVERI 60

NLGFNSIQAL SETSFAGLTK LELLMIHGNE IPSIPDGALR DLSSLQVFKP SYNKLRVITG 120 

QTLQGLSNLM RLHIDHNKIE FIHPQAFNGL TSLRLLHLEG NLLHQLHPST FSTFTFLDYF 180 

RLSTIRHLYL AENMVRTLPA SMLRNMPLLE NLYLQGNPWT CDCEMRWFLE WDAKSRGILK 240 
CKKDKAYEGG QLCAMCFSPK KLYKHEIHKL KDMTCLKPSI ESPLRQNRSR SIEEEQEQEB 300

DGGSQIilLBK FQLPQWSISL NMTDEHGNMV NLVCDIKKPM DVYKIHLNQT DPPDIDINAT 360 

VALDFECPMT RENYEKLWKL IAYYSEVPVK LHRELMLSKD PRVSYQYRQD ADEEALYYTG 420 

VRAQILAEPE WVMQPSIDIQ LNRRQSTAKK VLLSYYTQYS QTISTKDTRQ ARGRSWVMIB 480 

PSGAVQRDQT VLEGGPCQLS CNVKASESPS IFWVLPDGSI LKAPMDDPDS KFSILSSGWL 540 

RIKSMEPSDS GLYQCIAQVR DEMDRMVYRV LVQSPSTQPA BKDTVTIGKN PGESVTLPCN 600 

ALAIPEAHLS WILPNRRIIN DIiANTSHVYM LPNGTLSIPK VQVSDSGYYR CVAVNQQGAD 660 

HFTVGITVTK KGSGLPSKRG RRPGAKALSR VREDIVEDEG GSGMGDEENT SRRLLHPKDQ 720 

EVFLKTKDDA INGDKKAKKG RRKLKLWKHS EKEPBTNVAE GRRVFESRRR INMANKQINP 780 

ERWADILAKV RGKNLPKGTE VPPLIKTTSP PSLSLEVTPP FPAVSPPSAS PVQTVTSAEB 840 

SSADVPLLGE EEHVLGTISS ASMGLEHNHN GVILVEPEVT STPLEEWDD LSEKTEEITS 900 

TEGDLKGTAA PTLISEPYEP SPTLHTLDTV YEKPTHEETA TEGWSAADVG SSPEPTSSEY 960 

EPPLDAVSLA ESEPMQYFDP DLETKSQPDE DKMKEDTPAH LTPTPTIWVN DSSTSQLFED 1020 

STIGEPGVPG QSHLQGLTDN IHLVKSSLST QDTLLIKKGM KEMSQTLQGG NMLEGDPTHS 1080 

RSSESEGQES KSITLPDSTL GIMSSMSPVK KPAETTVGTL LDKDTTTVTT TPRQKVAPSS 1140 

TMSTHPSRRR PNGRRRLRPN KFRHRHKQTP PTTFAPSETF STQPTQAPDI KIS^QVESSL 1200 

VPTAWVDNTV NTPKQLEMEK NAEPTSKGTP RRKHGKRPNK HRYTPSTVSS RASGSKPSPS 1260 

PENKHRNIVT PSSETILLPR TVSLKTEGPY DSLDYMTTTR KIYSSYPKVQ ETLPVTYKPT 1320 

SDGKEIKDDV ATNVDKHKSD ILVTGESITN AIPTSRSLVS TMGEFKEESS PVGFPGTPTW 1380 

NPSRTAQPGR LQTDIPVTTS GENLTDPPLL KELEDVDFTS EFLSSLTVST PFHQEEAGSS 1440 

TTLSSIKVEV ASSQAETTTL DQDHLETTVA ILLSETRPQN HTPTAARMKE PASSSPSTIL 1500 

MSLGQTTTTK PALPSPRISQ ASRDSKENVF LNYVGNPETE ATPVNNEGTQ HMSGPNELST 1560 

PSSDRDAFNL STKLELEKQV FGSRSLPRGP DSQRQDGRVH ASHQLTRVPA KPILPTATVR 1620 

LPEMSTQSAS RYFVTSQSPR HWTNKPEITT YPSGALPENK QFTTPRLSST TIPLPLHMSK 1680 

PSIPSKFTDR RTDQFNGYSK VFGNNNIPEA RNPVGKPPSP RIPHYSNGRL PFFTNKTLSF 1740 

PQLGVTRRPQ IPTSPAPVMR ERKVIPGSYN RIHSHSTFHL DFGPPAPPLL HTPQTTGSPS 1800 

TNLQNIPMVS STQSSISFIT SSVQSSGSFH QSSSKFFAGG PPASKFWSLG EKPQILTKSP 1860 

QTVSVTAETD TVFPCEATGK PKPFVTWTKV STGALMTPNT RIQRFEVLKN GTLVIRKVQV 1920 

QDRGQYMCTA SNLHGLDRMV VLLSVTVQQP QILASHYQDV TVYLGDT1AM ECLAKGTPAP 1980 

QISWIFPDRR VWQTVSPVES RITLHENRTL SIKEASFSDR GVYKCVASNA AGADSLAIRL 2040 

HVAALPPVIH QEKLENISLP PGLSIHIHCT AKAAPLPSVR WVLGDGTQIR PSQFLHGNLF 2100 

VFPNGTLYIR NLAPKDSGRY ECVAANLVGS ARRTVQLNVQ RAAANARITG TSPRRTDVRY 2160 

GGTLKLDCSA SGDPWPRILW RLPSKRMIDA LFSFDSRIKV FANGTLVVKS VTDKDAGDYL 2220

CVARNKVGDD YVVLKVDVVM KPAKIEHKEE NDHKVFYGGD LKVDCVATGL PNPEISWSLP 2280

DGSLVNSFMQ SDDSGGRTKR YVVFNNGTLY FNEVGMREEG DYTCFAENQV GKDEMRVRVK 2340

VVTAPATIRN KTYLAVQVPY GDVVTVACEA KGEPMPKVTW LSPTNKVIPT SSEKYQIYQD 2400

GTLLIQKAQR SDSGNYTCLV RNSAGEDRKT VWIHVNVQPP KINGNPNPIT TVREIAAGGS 2460

RKLIDCKAEG IPTPRVLWAF PEGVVLPAPY YGNRITVHGN GSLDIRSLRK SDSVQLVCMA 2520

RNEGGEARLI VQLTVLEPME KPIFHDPISE KITAMAGHTI SLNCSAAGTP TPSLVWVLPN 2580 

GTDLQSGQQL QRFYHKADGM LHISGLSSVD AGAYRCVARN AAGHTERLVS LKVGLKPEAN 2640 

KQYHNLVSII NGETLKLPCT PPGAGQGRFS WTLPNGMHLE GPQTLGRVSL LDNGTLTVRE 2700 

ASVFDRGTYV CRMETEYGPS VTSIPVIVIA YPPRITSEPT PVIYTRPGNT VKLNCMAMGI 2760 

PKADITWELP DKSHLKAGVQ ARLYGNRFLH PQGSLTIQHA TQRDAGFYKC MAKNILGSDS 2820 
KTTYIHVF 

Seq ID NO: 418 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..5001

1 11 21 31 41 51

| | | | | |

ATGCCAGGCA CAAAACTAAC CCGAACAGGC GCCCCAGCAG ACTACAGAGT GATATTGAAG 60 

ACCTCTCAAG AGGACGAATT GGATGTACCT GACGACATCA GCGTCCGGGT TATGTCATCT 120 

CAGTCTGTGC TTGTGTCCTG GGTGGATCCT GTTCTGGAAA AACAGAAGAA AGTTGTTGCA 180 

TCAAGACAGT ACACCGTGCG CTATCGAGAG AAGGGGGAAT TGGCCAGGTG GGATTATAAG 240 

CAGATCGCTA ACAGGCGTGT GCTGATTGAG AACCTGATTC CAGACACTGT GTATGAATTT 300 

GCAGTCCGTA TTTCACAGGG TGAAAGAGAT GGCAAATGGA GTACGTCAGT CTTCCAAAGA 360 

ACACCAGAAT CTGCCCCTAC CACAGCTCCT GAAAACTTGA ACGTCTGGCC AGTCAATGGC 420 

AAACCTACAG TTGTCGCTGC ATCTTGGGAT GCGCTACCAG AGACTGAGGG GAAAGTGAAA 480 

GTCTGTCTGC TGGACACAGG ACTGTTTTCA GTTTCCTCCT TCCAACCATC TGCCAAATCA 540 

TTTCAGAATA CATTCTTTCA TACGCCCCGG CTCTCAAACC ATTTGGAGCA AAGTCCCTCA 600 

CCTATCCTGG AGACACTACT TCTGCCCTGG TGGATGGTCT GCAGCCTGGG GAACGCTATC 660 

TTTTCAAAAT CCGGGCCACA AACAGGAGAG GCCTGGGACC TCACTCCAAA GCCTTCATTG 720 

TCGCTATGCC AACAAGAATG CAGCTGTACC CAGAAGGATT TCAGTTGTCT AGCTTACCTG 780 

ATCGATATCC AAACCAAACA AGTTAATAAA GATCCACAAC TGGAAGGGAG TGTTTTTGGA 840 

CCATGTTTTC TTTTCTACTT CCTCACATTT ATGCTGGATA TTGGCGGCTT TTCCTTCATT 900 

ATGTGCTATG AAGACCCANN TGTTTCTTCT TTGACAGGCA ATTCTTTAAA ATCTGTTGCA 960 

GCCAGTAAGG CGGATGTTCA GCAGAACACG GAGGACAATG GGAAACCCGA AAAACCTGAG 1020 

CCTTCCTCAC CTTCTCCCAG AGCTCCAGCT TCCTCCCAAC ACCCCTCTGT GCCTGCTTCT 1080 

CCCCAAGGGA GAAATGCCAA GGACCTTCTT CTTGACTTGA AGAACAAAAT ATTGGCTAAT 1140 

GGTGGGGCGC CCCGAAAACC CCAGCTTCGC GCCAAGAAGG CAGAGGAGCT GGATCTTCAG 1200 

TCGACAGAAA TCACTGGGGA GGAGGAGCTG GGTTCCCGGG AGGACTCGCC CATGTCACCC 1260 

TCAGACACCC AAGACCAGAA ACGGACCCTG AGGCCGCCAA GTAGACACGG CCACTCGGTG 1320 

GTTGCTCCCG GCAGGACTGC AGTGAGGGCC CGGATGCCAG CGCTGCCCCG AAGGGAAGGC 1380 

GTAGATAAGC CTGGCTTTTC CCTGGCCACG CAGCCCCGCC CAGGGGCGCC CCCCTCGGCT 1440 

TCGGCCTCTC CTGCCCACCA CGCGTCCACC CAGGGCACCT CTCATCGTCC TTCCCTGCCT 1500 

GCCAGCTTGA ATGACAACGA CTTGGTGGAC TCAGACGAAG ATGAGCGCGC TGTGGGCTCC 1560 

CTCCACCCCA AGGGCGCCTT CGCCCAGCCC CGGCCAGCCC TGTCCCCCAG CCGCCAGTCC 1620 

CCGTCCAGCG TTCTCCGCGA CAGAAGCTCT GTGCACCCCG GCGCAAAGCC AGCCTCGCCG 1680 

GCGCGGAGGA CCCCCCATTC AGGGGCCGCA GAGGAAGATT CCAGTGCCTC AGCCCCACCC 1740 

TCAAGACTTT CTCCACCCCA TGGGGGATCA TCTCGGCTGC TGCCCACCCA GCCACACCTG 1800 

AGCTCTCCAC TTTCCAAGGG CGGGAAGGAT GGTGAGGACG CCCCAGCCAC CAACTCCAAT 1860 

GCGCCATCAC GGTCCACCAT GTCCTCCTCC GTCTCTTCTC ATCTCTCGTC CAGGACGCAG 1920 

GTCTCTGAGG GAGCGGAGGC TTCTGATGGT GAAAGCCACG GTGACGGCGA TAGGGAAGAC 1980 

GGCGGAAGGC AGGCGGAGGC CACGGCCCAG ACGCTGCGGG CCCGGCCTGC CTCTGGACAC 2040 

TTCCATTTGC TCAGACACAA ACCCTTTGCT GCCAACGGGA GGTCTCCAAG CAGGTTCAGC 2100 

ATTGGGCGGG GACCTCGGCT GCAGCCCTCC AGCTCCCCAC AGTCGACTGT GCCCTCCCGA 2160 
GCCCACCCCA GGGTTCCCTC TCACTCTGAT TCCCACCCTA AGCTTAGCTC AGGTATCCAT 2220

GGAGACGAGG AGGATGAGAA GCCGCTTCCT GCCACCGTTG TCAATGACCA CGTGCCTTCC 2280

TCCTCCAGGC AGCCCATCTC CCGGGGCTGG GAGGACTTAA GGAGAAGCCC GCAGAGAGGG 2340 

GCCAGCCTGC ATCGGAAGGA ACCCATCCCA GAGAACCCCA AATCCACAGG GGCAGATACA 2400

CATCCTCAGG GCAAGTACTC CTCCCTGGCC TCCAAGGCTC AGGATGTTCA ACAGAGCACA 2460 

GACGCGGACA CGGAGGGTCA TTCTCCCAAA GCACAGCCAG GGTCCACAGA CCGCCACGCG 2520

TCCCCTGCTC GTCCTCCCGC AGCACGGTCA CAGCAGCATC CCAGTGTTCC CAGAAGGATG 2580 

AGACCCGGCC GGGCCCCAGA ACAGCAGCCC CCTCCTCCCG TCGCCACGTC CCAGCACCAC 2640

CCGGGACCCC AGAGCAGAGA CGCGGGTCGG TCACCTTCCC AGCCCAGGCT CTCACTGACC 2700 

CAGGCCGGGC GGCCCCGCCC CACGTCGCAG GGCCGCTCCC ACTCCTCCTC GGACCCTTAC 2760

ACGGCGAGCT CCAGAGGGAT GCTCCCCACG GCCCTCCAGA ACCAGGACGA GGATGCCCAG 2820 

GGCAGCTACG ACGACGACAG CACAGAAGTC GAGGCCCAGG ATGTGCGGGC CCCCGCGCAC 2880

GCCGCGCGCG CCAAGGAGGC AGCTGCGTCC CTTCCCAAGC ACCAGCAGGT GGAGTCTCCC 2940 

ACAGGCGCAG GGGCAGGTGG CGACCACAGG TCCCAGCGCG GACATGCGGC CTCCCCCGCC 3000

AGGCCCAGCC GACCCGGCGG CCCCCAGTCC CGCGCCCGGG TCCCCAGCAG GGCAGCGCCG 3060 

GGGAAGTCGG AGCCTCCTTC CAAGCGGCCC CTGTCCTCCA AGTCCCAGCA GTCGGTCTCA 3120

GCCGAGGACG AGGAGGAGGA GGACGCGGGG TTTTTTAAAG GCGGGAAAGA AGACCTTCTG 3180

TCTTCCTCTG TGCCAAAGTG GCCCTCTTCC TCCACTCCCA GGGGCGGCAA AGACGCCGAT 3240 

GGGAGCCTCG CCAAGGAAGA GAGGGAGCCT GCCATCGCGC TTGCCCCTCG CGGAGGGAGC 3300 

CTGGCTCCTG TGAAGCGACC TCTCCCCCCA CCTCCAGGCA GCTCCCCCAG GGCCTCCCAC 3360 

GTCCCTTCCC GACCGCCGCC TCGCAGCGCT GCCACCGTGA GCCCCGTCGC GGGCACCCAC 3420 

CCCTGGCCGC GGTACACCAC GCGCGCCCCV CCTGGCCACT TCTCCACCAC CCCGATGCTG 3480 

TCCTTGCGCC AGAGGATGAT GCATGCCAGA TTCCGTAACC CTCTCTCCCG ACAGCCTGCC 3540 

AGACCCTCTT ACAGACAAGG TTATAATGGC AGACCAAATG TAGAAGGGAA AGTCCTTCCT 3600 

GGTAGTAATG GAAAACCGAA TGGACAGAGA ATTATCAATG GCCCTCAAGG AACAAAGTGG 3660

GTTGTGGACC TTGATCGTGG GTTAGTATTG AATGCAGAAG GAAGGTACCT CCAAGATTCA 3720 

CATGGAAATC CTCTTCGGAT TAAACTAGGA GGAGATGGTC GAACCATTGT AGATCTGGAA 3780 

GGGACCCCCG TGGTGAGTCC TGACGGCCTC CCACTCTTTG GGCAGGGGCG ACATGGCACA 3840 

CCTCTGGCCA ATGCCCAAGA TAAGCCAATT TTGAGTCTTG GAGGAAAGCC GCTGGTGGGC 3900 

TTGGAGGTCA TCAAAAAAAC CACCCATCCC CCTACCACTA CCATGCAGCC CACCACTACT 3960 

ACGACGCCCC TGCCTACCAC TACAACCCCG AGGCCCACCA CTGCCACCAC CATGCAGCCC 4020 

ACCACTACTA CGACGCCCCT GCCTACCACT ACACCGAGGC CCACCACTGC CACCACCCGC 4080 

CGCACGACCA CCAGGCGTCC AACAACCACA GTCCGAACCA CTACGCGGAC AACCACCACC 4140 

ACCACCCCCA AACCCACCAC TCCCATCCCC ACCTGTCCCC CTGGGACCTT GGAACGGCAC 4200 

GACGATGATG GCAACCTGAT AATGAGCTCC AATGGGATCC CAGAGTGCTA CGCTGAAGAA 4260 

GATGAGTTCT CAGGCTTGGA GACTGACACT GCAGTACCTA CGGAAGAGGC CTACGTTATA 4320 

TATGATGAAG ATTATGAATT TGAGACGTCA AGGCCACCAA CCACCACTGA GCCTTCGACC 4380

ACTGCTACCA CACCGAGGGT GATCCCAGAG GAAGGCGCCA TCAGTTCCTT TCCTGAAGAA 4440 

GAATTTGATC TGGCTGGAAG GAAACGATTT GTTGCTCCTT ACGTGACGTA CCTAAATAAA 4500 

GACCCATCAG CCCCGTGCTC TCTGACTGAT GCACTGGATC ACTTCCAAGT GGACAGCCTG 4560 

GATGAAATCA TCCCCAATGA CCTGAAGAAG AGTGATCTGC CTCCCCAGCA TGCTCCCCGC 4620

AACATCACCG TGGTGGCCGT GGAAGGTTGC CACTCATTTG TCATTGTGGA TTGGGACAAA 4680 

GCCACCCCAG GAGATTTGGT CACAGGTTAT TTGGTTTACA GTGCATCCTA TGAAGATTTC 4740 

ATCAGGAACA AGTTTTCCAC TCAAGCTTCA TCAGTAACTC ACTTGCCCAT TGAGAACCTA 4800

AAGCCCAACA CGAGGTATTA TTTTAAAGTG CAAGCACAAA ATCCTCATGG CTACGGACCT 4860 

ATCAGCCCTT CGGTCTCATT TGTCACCGAA TCAGATAATC CTCTGCTTGT TGTGAGGCCC 4920 

CCAGGCGGTG AGCTATCTGG ATCCCATTCG CTTTCAAACA TGATCCCAGC TACACGGACT 4980 

GCCATGGACG GCAATATGTG AAGCGCACGT GGTATCGAAA GTTCGTGGGA GTTGTTCTTT 5040 

GTAATTCACT GAGGTATAAA ATCTACCTCA GTGACAACCT GAAAGATACA TTCTACAGCA 5100 

TTGGAGACAG CTGGGGAAGA GGTGAAGACC ATTGCCAATT TGTGGATTCA CACCTTGATG 5160 

GAAGAACAGG GCCTCAGTCC TATGTAGAAG CCCTCCCTAC TATTCAAGGC TACTATCGCC 5220 

AGTATCGTCA GGAGCCTGTC AGGTTTGGGA ACATCGGCTT CGGAACCCCC TACTACTATG 5280 

TGGGCTGGTA CGAGTGTGGG GTCTCCATCC CTGGAAAGTG GTAATCACAG GACCGTCATG 5340 

CTGCAAGCTT GCCCTGCCCA GCCCCACCAA CTAAGTCGCA CTAGGGGCTG TGAGCAAAGA 5400 

CAGCCAGCAT GCTCAGCCCC GCTGCCCTAG GTGCCAGGAA GGTCACAGAT GGACACTGGC 5460 

CATTCTGGTC ATCTCAGTCT GGAACTCAGT CCCACTTCTT GGCCTGGACA ATGAACAGGA 5520 

TTCAGTTTTG CTGTTAACTT TGCTTCTCTA CTTTTTTTTG TTTGTTTGTA ATAGCACATC 5580 

CCAGAGACAT CAGAAACCAG CAACTGATTC AGTGTGATTT CCCAGACTTT TTAGGCATGA 5640 

AATTCGGACA CTTCAGTATT TCCAGGAATA GCATATGCAC GCTGTTCTTG CTTCATGGAA 5700 

TGCTACATGC TTTCTGTTTT TCTCATTTTG GATTTCTCCA AAACTAACTG AATTTAAGCT 5760 

TCAGGTCCCT TTGTATGCAG TAGAAAGGAA TTATTAAAAA CACCACCAAA GAAAATAAAT 5820 

ATATCCTACT TGAAATTTAC TCTATGGACT TACCCACTGC TAGAATAAAT GTATCAAATC 5880 

TTATTTGTAA ATTCTCAATT TTGATATATA TATGTATATA TGCATATACA TATCCACACT 5940 

TGTCTGCAAG AATATTGATT AAAATTGCTA AATTTGTACT TGTTCACCAA AAAAAAAAAA 6000 
AAAAAAA 

Seq ID NO: 419 Protein sequence 
Protein Accession #: Eos sequence

1 11 21 31 41 51

| | | | | |

MPGTKLTRTG APADYRVILK TSQEDELDVP DDISVRVMSS QSVLVSWVDP VLEKQKKVVA 60

SRQYTVRYRE KGELARWDYK QIANRRVLIE NLIPDTVYEF AVRISQGERD GKWSTSVFQR 120

TPESAPTTAP ENLNVWPVNG KPTVVAASWD ALPETEGKVK VCLLDTGLFS VSSFQPSAKS 180

FQNTFFHTPR LSNHLEQSPS PILETLLLPW WMVCSLGNAI FSKSGPQTGE AWDLTPKPSL 240

SLCQQECSCT QKDFSCLAYL IDIQTKQVNK DPQLEGSVFG PCFLFYFLTF MLDIGGFSFI 300

MCYEDPVSSL TGNSLKSVAA SKADVQQNTE DNGKPEKPEP SSPSPRAPAS SQHPSVPASP 360 

QGRNAKDLLL DLKNKILANG GAPRKPQLRA KKAEELDLQS TEITGEEELG SREDSPMSPS 420 

DTQDQKRTLR PPSRHGHSVV APGRTAVRAR MPALPRREGV DKPGFSLATQ PRPGAPPSAS 480

ASPAHHASTQ GTSHRPSLPA SLNDNDLVDS DEDERAVGSL HPKGAFAQPR PALSPSRQSP 540

SSVLRDRSSV HPGAKPASPA RRTPHSGAAE EDSSASAPPS RLSPPHGGSS RLLPTQPHLS 600

SPLSKGGKDG EDAPATNSNA PSRSTMSSSV SSHLSSRTQV SEGAEASDGE SHGDGDREDG 660

GRQAEATAQT LRARPASGHF HLLRHKPFAA NGRSPSRFSI GRGPRLQPSS SPQSTVPSRA 720

HPRVPSHSDS HPKLSSGIHG DEEDEKPLPA TVVNDHVPSS SRQPISRGWE DLRRSPQRGA 780

SLHRKEPIPE NPKSTGADTH PQGKYSSLAS KAQDVQQSTD ADTEGHSPKA QPGSTDRHAS 840 

PARPPAARSQ QHPSVPRRMT PGRAPEQQPP PPVATSQHHP GPQSRDAGRS PSQPRLSLTQ 900 

AGRPRPTSQG RSHSSSDPYT ASSRGMLPTA LQNQDEDAQG SYDDDSTEVE AQDVRAPAHA 960 
ARAKEAAASL PKHQQVESPT GAGAGGDHRS QRGHAASPAR PSRPGGPQSR ARVPSRAAPG 1020

KSEPPSKRPL SSKSQQSVSA EDEEEEDAGF FKGGKEDLLS SSVPKWPSSS TPRGGKDADG 1080

SLAKEEREPA IALAPRGGSL APVKRPLPPP PGSSPRASHV PSRPPPRSAA TVSPVAGTHP 1140

WPRYTTRAPP GHFSTTPMLS LRQRMMHARF RNPLSRQPAR PSYRQGYNGR PNVEGKVLPG 1200

SNGKPNGQRI INGPQGTKWV VDLDRGLVLN AEGRYLQDSH GNPLRIKLGG DGRTIVDLEG 1260

TPVVSPDGLP LFGQGRHGTP LANAQDKPIL SLGGKPLVGL EVIKKTTHPP TTTMQPTTTT 1320

TPLPTTTTPR PTTATTMQPT TTTTPLPTTT PRPTTATTRR TTTRRPTTTV RTTTRTTTTT 1380 

TPKPTTPIPT CPPGTLERHD DDGNLIMSSN GIPECYAEED EFSGLETDTA VPTEEAYVIY 1440 

DEDYEFETSR PPTTTEPSTT ATTPRVIPEE GAISSFPEEE FDLAGRKRFV APYVTYLNKD 1500 

PSAPCSLTDA LDHFQVDSLD EIIPNDLKKS DLPPQHAPRN ITVVAVEGCH SFVIVDWDKA 1560

TPGDLVTGYL VYSASYEDFI RNKFSTQASS VTHLPIENLK PNTRYYFKVQ AQNPHGYGPI 1620
SPSVSFVTES DNPLLVVRPP GGELSGSHSL SNMIPATRTA MDGNM



Seq ID NO: 420 DNA sequence
Nucleic Acid Accession #: NM_022743
Coding sequence: 128..1237

1 11 21 31 41 51 

GTGGATTTTA GAGATACCTC CCCTCCTTCT GCTCAGCTGC CTTGCAGTAA TTAAACTCTT 60 

TCTCTGCTGC AACACCCCTA CTGTTCTCCG TGTATTGGCT TTTCTGGGCA GCAGGAAGGA 120 

AAAGCTGATG CGATGCTCTC AGTGCCGCGT CGCCAAATAC TGTAGTGCTA AGTGTCAGAA 180 

AAAAGCTTGG CCAGACCACA AGCGGGAATG CAAATGCCTT AAAAGCTGCA AACCCAGATA 240 

TCCTCCAGAC TCCGTTCGAC TTCTTGGCAG AGTTGTCTTC AAACTTATGG ATGGAGCACC 300 

TTCAGAATCA GAGAAGCTTT ACTCATTTTA TGATCTGGAG TCAAATATTA ACAAACTGAC 360

TGAAGATAAG AAAGAGGGCC TCAGGCAACT CGTAATGACA TTTCAACATT TCATGAGAGA 420 

AGAAATACAG GATGCCTCTC AGCTGCCACC TGCCTTTGAC CTTTTTGAAG CCTTTGCAAA 480 

AGTGATCTGC AACTCTTTCA CCATCTGTAA TGCGGAGATG CAGGAAGTTG GTGTTGGCCT 540

ATATCCCAGT ATCTCTTTGC TCAATCACAG CTGTGACCCC AACTGTTCGA TTGTGTTCAA 600 

TGGGCCCCAC CTCTTACTGC GAGCAGTCCG AGACATCGAG GTGGGAGAGG AGCTCACCAT 660 

CTGCTACCTG GATATGCTGA TGACCAGTGA GGAGCGCCGG AAGCAGCTGA GGGACCAGTA 720 

CTGCTTTGAA TGTGACTGTT TCCGTTGCCA AACCCAGGAC AAGGATGCTG ATATGCTAAC 780 

TGGTGATGAG CAAGTATGGA AGGAAGTTCA AGAATCCCTG AAAAAAATTG AAGAACTGAA 840 

GGCACACTGG AAGTGGGAGC AGGTTCTGGC CATGTGCCAG GCGATCATAA GCAGCAATTC 900 

TGAACGGCTT CCCGATATCA ACATCTACCA GCTGAAGGTG CTCGACTGCG CCATGGATGC 960 

CTGCATCAAC CTCGGCCTGT TGGAGGAAGC CTTGTTCTAT GGTACTCGGA CCATGGAGCC 1020 

ATACAGGATT TTTTTCCCAG GAAGCCATCC CGTCAGAGGG GTTCAAGTGA TGAAAGTTGG 1080 

CAAACTGCAG CTACATCAAG GCATGTTTCC CCAAGCAATG AAGAATCTGA GACTGGCTTT 1140 

TGATATTATG AGAGTGACAC ATGGCAGAGA ACACAGCCTG ATTGAAGATT TGATTCTACT 1200 

TTTAGAAGAA TGCGACGCCA ACATCAGAGC ATCCTAAGGG AACGCAGTCA GAGGGAAATA 1260 

CGGCGTGTGT CTTTGTTGAA TGCCTTATTG AGGTCACACA CTCTATGCTT TGTTAGCTGT 1320 

GTGAACCTCT CTTATTGGAA ATTCTGTTCC GTGTTTGTGT AGGTAAATAA AGGCAGACAT 1380 

GGTTTGCAAA CCACAAGAAT CATTAGTTGT AGAGAAGCAC GATTATAATA AATTCAAAAC 1440 
ATTTGGTTGA GGATGCCAAA AAAAAAAAAA AAAAAAA 



Seq ID NO: 421 Protein sequence 
Protein Accession #: NP_073580

1 11 21 31 41 51

| | | | | |

MRCSQCRVAK YCSAKCQKKA WPDHKRECKC LKSCKPRYPP DSVRLLGRVV FKLMDGAPSE 60

SEKLYSFYDL ESNINKLTED KKEGLRQLVM TFQHFMREEI QDASQLPPAF DLFEAFAKVI 120

CNSFTICNAE MQEVGVGLYP SISLLNHSCD PNCSIVFNGP HLLLRAVRDI EVGEELTICY 180

LDMLMTSEER RKQLRDQYCF ECDCFRCQTQ DKDADMLTGD EQVWKEVQES LKKIEELKAH 240

WKWEQVLAMC QAIISSNSER LPDINIYQLK VLDCAMDACI NLGLLEEALF YGTRTMEPYR 300

IFFPGSHPVR GVQVMKVGKL QLHQGMFPQA MKNLRLAFDI MRVTHGREHS LIEDLILLLE 360
ECDANIRAS

Seq ID NO: 422 DNA sequence 

Nucleic Acid Accession #: NM_003014.2

Coding sequence: 238..648

1 11 21 31 41 51

| | | | | |

GGCGGGTTCG CGCCCCGAAG GCTGAGAGCT GGCGCTGCTC GTGCCCTGTG TGCCAGACGG 60 

CGGAGCTCCG CGGCCGGACC CCGCGGCCCC GCTTTGCTGC CGACTGGAGT TTGGGGGAAG 120

AAACTCTCCT GCGCCCCAGA AGATTTCTTC CTCGGCGAAG GGACAGCGAA AGATGAGGGT 180 

GGCAGGAAGA GAAGGCGCTT TCTGTCTGCC GGGGTCGCAG CGCGAGAGGG CAGTGCCATG 240 

TTCCTCTCCA TCCTAGTGGC GCTGTGCCTG TGGCTGCACC TGGCGCTGGG CGTGCGCGGC 300 

GCGCCCTGCG AGGCGGTGCG CATCCCTATG TGCCGGCACA TGCCCTGGAA CATCACGCGG 360 

ATGCCCAACC ACCTGCACCA CAGCACGCAG GAGAACGCCA TCCTGGCCAT CGAGCAGTAC 420 

GAGGAGCTGG TGGACGTGAA CTGCAGCGCC GTGCTGCGCT TCTTCTTCTG TGCCATGTAC 480 

GCGCCCATTT GCACCCTGGA GTTCCTGCAC GACCCTATCA AGCCGTGCAA GTCGGTGTGC 540 

CAACGCGCGC GCGACGACTG CGAGCCCCTC ATGAAGATGT ACAACCACAG CTGGCCCGAA 600 

AGCCTGGCCT GCGACGAGCT GCCTGTCTAT GACCGTGGCG TGTGCATTTC GCCTGAAGCC 660 

ATCGTCACGG ACCTCCCGGA GGATGTTAAG TGGATAGACA TCACACCAGA CATGATGGTA 720 

CAGGAAAGGC CTCTTGATGT TGACTGTAAA CGCCTAAGCC CCGATCGGTG CAAGTGTAAA 780 

AAGGTGAAGC CAACTTTGGC AACGTATCTC AGCAAAAACT ACAGCTATGT TATTCATGCC 840 

AAAATAAAAG CTGTGCAGAG GAGTGGCTGC AATGAGGTCA CAACGGTGGT GGATGTAAAA 900 

GAGATCTTCA AGTCCTCATC ACCCATCCCT CGAACTCAAG TCCCGCTCAT TACAAATTCT 960 

TCTTGCCAGT GTCCACACAT CCTGCCCCAT CAAGATGTTC TCATCATGTG TTACGAGTGG 1020 

CGTTCAAGGA TGATGCTTCT TGAAAATTGC TTAGTTGAAA AATGGAGAGA TCAGCTTAGT 1080 

AAAAGATCCA TACAGTGGGA AGAGAGGCTG CAGGAACAGC GGAGAACAGT TCAGGACAAG 1140 

AAGAAAACAG CCGGGCGCAC CAGTCGTAGT AATCCCCCCA AACCAAAGGG AAAGCCTCCT 1200 

GCTCCCAAAC CAGCCAGTCC CAAGAAGAAC ATTAAAACTA GGAGTGCCCA GAAGAGAACA 1260 

AACCCGAAAA GAGTGTGAGC TAACTAGTTT CCAAAGCGGA GACTTCCGAC TTCCTTACAG 1320 

GATGAGGCTG GGCATTGCCT GGGACAGCCT ATGTAAGGCC ATGTGCCCCT TGCCCTAACA 1380 
ACTCACTGCA GTGCTCTTCA TAGACACATC TTGCAGCATT TTTCTTAAGG CTATGCTTCA 1440

GTTTTTCTTT GTAAGCCATC ACAAGCCATA GTGGTAGGTT TGCCCTTTGG TACAGAAGGT 1500 

GAGTTAAAGC TGGTGGAAAA GGCTTATTGC ATTGCATTCA GAGTAACCTG TGTGCATACT 1560 

CTAGAAGAGT AGGGAAAATA ATGCTTGTTA CAATTCGACC TAATATGTGC ATTGTAAAAT 1620 

AAATGCCATA TTTCAAACAA AACACGTAAT TTTTTTACAG TATGTTTTAT TACCTTTTGA 1680 

TATCTGTTGT TGCAATGTTA GTGATGTTTT AAAATGTGAT GAAAATATAA TGTTTTTAAG 1740 

AAGGAACAGT AGTGGAATGA ATGTTAAAAG ATCTTTATGT GTTTATGGTC TGCAGAAGGA 1800 

TTTTTGTGAT GAAAGGGGAT TTTTTGAAAA ATTAGAGAAG TAGCATATGG AAAATTATAA 1860 

TGTGTTTTTT TACCAATGAC TTCAGTTTCT GTTTTTAGCT AGAAACTTAA AAACAAAAAT 1920 

AATAATAAAG AAAAATAAAT AAAAAGGAGA GGCAGACAAT GTCTGGATTC CTGTTTTTTG 1980 

GTTACCTGAT TTCCATGATC ATGATGCTTC TTGTCAACAC CCTCTTAAGC AGCACCAGAA 2040 

ACAGTGAGTT TGTCTGTACC ATTAGGAGTT AGGTACTAAT TAGTTGGCTA ATGCTCAAGT 2100 

ATTTTATACC CACAAGAGAG GTATGTCACT CATCTTACTT CCCAGGACAT CCACCCTGAG 2160 

AATAATTTGA CAAGCTTAAA AATGGCCTTC ATGTGAGTGC CAAATTTTGT TTTTCTTCAT 2220 

TTAAATATTT TCTTTGCCTA AATACATGTG AGAGGAGTTA AATATAAATG TACAGAGAGG 2280 

AAAGTTGAGT TCCACCTCTG AAATGAGAAT TACTTGACAG TTGGGATACT TTAATCAGAA 2340 

AAAAAGAACT TATTTGCAGC ATTTTATCAA CAAATTTCAT AATTGTGGAC AATTGGAGGC 2400

ATTTATTTTA AAAAACAATT TTATTGGCCT TTTGCTAACA CAGTAAGCAT GTATTTTATA 2460 

AGGCATTCAA TAAATGCACA ACGCCCAAAG GAAATAAAAT CCTATCTAAT CCTACTCTCC 2520 

ACTACACAGA GGTAATCACT ATTAGTATTT TGGCATATTA TTCTCCAGGT GTTTGCTTAT 2580 

GCACTTATAA AATGATTTGA ACAAATAAAA CTAGGAACCT GTATACATGT GTTTCATAAC 2640 

CTGCCTCCTT TGCTTGGCCC TTTATTGAGA TAAGTTTTCC TGTCAAGAAA GCAGAAACCA 2700 

TCTCATTTCT AACAGCTGTG TTATATTCCA TAGTATGCAT TACTCAACAA ACTGTTGTGC 2760 
TATTGGATAC TTAGGTGGTT TCTTCACTGA CAATACTGAA TAAACATCTC ACCGGAATTC 

Seq ID NO: 423 Protein sequence 
Protein Accession #: NP_003005.1

1 11 21 31 41 51

| | | | | |

MFLSILVALC LWLHLALGVR GAPCEAVRIP MCRHMPWNIT RMPNHLHHST QENAILAIEQ 60 

YEELVDVNCS AVLRFFFCAM YAPICTLEFL HDPIKPCKSV CQRARDDCEP LMKMYNHSWP 120

ESLACDELPV YDRGVCISPE AIVTDLPEDV KWIDITPDMM VQERPLDVDC KRLSPDRCKC 180

KKVKPTLATY LSKNYSYVIH AKIKAVQRSG CNEVTTVVDV KEIFKSSSPI PRTQVPLITN 240

SSCQCPHILP HQDVLIMCYE WRSRMMLLEN CLVEKWRDQL SKRSIQWEER LQEQRRTVQD 300
KKKTAGRTSR SNPPKPKGKP PAPKPASPKK NIKTRSAQKR TNPKRV 

Seq ID NO: 424 DNA sequence 
Nucleic Acid Accession #: BC010423 
Coding sequence: 248..1780

1 11 21 31 41 51

| | | | | |

CACAGCGTGG GAAGCAGCTC TGGGGGAGCT CGGAGCTCCC GATCACGGCT TCTTGGGGGT 60 

AGCTACGGCT GGGTGTGTAG AACGGGGCCG GGGCTGGGGC TGGGTCCCCT AGTGGAGACC 120 

CAAGTGCGAG AGGCAAGAAC TCTGCAGCTT CCTGCCTTCT GGGTCAGTTC CTTATTCAAG 180 

TCTGCAGCCG GCTCCCAGGG AGATCTCGGT GGAACTTCAG AAACGCTGGG CAGTCTGCCT 240 

TTCAACCATG CCCCTGTCCC TGGGAGCCGA GATGTGGGGG CCTGAGGCCT GGCTGCTGCT 300 

GCTGCTACTG CTGGCATCAT TTACAGGCCG GTGCCCCGCG GGTGAGCTGG AGACCTCAGA 360 

CGTGGTAACT GTGGTGCTGG GCCAGGACGC AAAACTGCCC TGCTTCTACC GAGGGGACTC 420 

CGGCGAGCAA GTGGGGCAAG TGGCATGGGC TCGGGTGGAC GCGGGCGAAG GCGCCCAGGA 480

ACTAGCGCTA CTGCACTCCA AATACGGGCT TCATGTGAGC CCGGCTTACG AGGGCCGCGT 540 

GGAGCAGCCG CCGCCCCCAC GCAACCCCCT GGACGGCTCA GTGCTCCTGC GCAACGCAGT 600 

GCAGGCGGAT GAGGGCGAGT ACGAGTGCCG GGTCAGCACC TTCCCCGCCG GCAGCTTCCA 660

GGCGCGGCTG CGGCTCCGAG TGCTGGTGCC TCCCCTGCCC TCACTGAATC CTGGTCCAGC 720 

ACTAGAAGAG GGCCAGGGCC TGACCCTGGC AGCCTCCTGC ACAGCTGAGG GCAGCCCAGC 780 

CCCCAGCGTG ACCTGGGACA CGGAGGTCAA AGGCACAACG TCCAGCCGTT CCTTCAAGCA 840

CTCCCGCTCT GCTGCCGTCA CCTCAGAGTT CCACTTGGTG CCTAGCCGCA GCATGAATGG 900

GCAGCCACTG ACTTGTGTGG TGTCCCATCC TGGCCTGCTC CAGGACCAAA GGATCACCCA 960 

CATCCTCCAC GTGTCCTTCC TTGCTGAGGC CTCTGTGAGG GGCCTTGAAG ACCAAAATCT 1020

GTGGCACATT GGCAGAGAAG GAGCTATGCT CAAGTGCCTG AGTGAAGGGC AGCCCCCTCC 1080 

CTCATACAAC TGGACACGGC TGGATGGGCC TCTGCCCAGT GGGGTACGAG TGGATGGGGA 1140 

CACTTTGGGC TTTCCCCCAC TGACCACTGA GCACAGCGGC ATCTACGTCT GCCATGTCAG 1200 

CAATGAGTTC TCCTCAAGGG ATTCTCAGGT CACTGTGGAT GTTCTTGACC CCCAGGAAGA 1260 

CTCTGGGAAG CAGGTGGACC TAGTGTCAGC CTCGGTGGTG GTGGTGGGTG TGATCGCCGC 1320 

ACTCTTGTTC TGCCTTCTGG TGGTGGTGGT GGTGCTCATG TCCCGATACC ATCGGCGCAA 1380

GGCCCAGCAG ATGACCCAGA AATATGAGGA GGAGCTGACC CTGACCAGGG AGAACTCCAT 1440 

CCGGAGGCTG CATTCCCATC ACACGGACCC CAGGAGCCAG CCGGAGGAGA GTGTAGGGCT 1500 

GAGAGCCGAG GGCCACCCTG ATAGTCTCAA GGACAACAGT AGCTGCTCTG TGATGAGTGA 1560 

AGAGCCCGAG GGCCGCAGTT ACTCCACGCT GACCACGGTG AGGGAGATAG AAACACAGAC 1620

TGAACTGCTG TCTCCAGGCT CTGGGCGGGC CGAGGAGGAG GAAGATCAGG ATGAAGGCAT 1680 

CAAACAGGCC ATGAACCATT TTGTTCAGGA GAATGGGACC CTACGGGCCA AGCCCACGGG 1740 

CAATGGCATC TACATCAATG GGCGGGGACA CCTGGTCTGA CCCAGGCCTG CCTCCCTTCC 1800 

CTAGGCCTGG CTCCTTCTGT TGACATGGGA GATTTTAGCT CATCTTGGGG GCCTCCTTAA 1860 

ACACCCCCAT TTCTTGCGGA AGATGCTCCC CATCCCACTG ACTGCTTGAC CTTTACCTCC 1920 

AACCCTTCTG TTCATCGGGA GGGCTCCACC AATTGAGTCT CTCCCACCAT GCATGCAGGT 1980 

CACTGTGTGT GTGCATGTGT GCCTGTGTGA GTGTTGACTG ACTGTGTGTG TGTGGAGGGG 2040 

TGACTGTCCG TGGAGGGGTG ACTGTGTCCG TGGTGTGTAT TATGCTGTCA TATCAGAGTC 2100 

AAGTGAACTG TGGTGTATGT GCCACGGGAT TTGAGTGGTT GCGTGGGCAA CACTGTCAGG 2160 

GTTTGGCGTG TGTGTCATGT GGCTGTGTGT GACCTCTGCC TGAAAAAGCA GGTATTTTCT 2220 

CAGACCCCAG AGCAGTATTA ATGATGCAGA GGTTGGAGGA GAGAGGTGGA GACTGTGGCT 2280 

CAGACCCAGG TGTGCGGGCA TAGCTGGAGC TGGAATCTGC CTCCGGTGTG AGGGAACCTG 2340 

TCTCCTACCA CTTCGGAGCC ATGGGGGCAA GTGTGAAGCA GCCAGTCCCT GGGTCAGCCA 2400 

GAGGCTTGAA CTGTTACAGA AGCCCTCTGC CCTCTGGTGG CCTCTGGGCC TGCTGCATGT 2460 

ACATATTTTC TGTAAATATA CATGCGCCGG GAGCTTCTTG CAGGAATACT GCTCCGAATC 2520 

ACTTTTAATT TTTTTCTTTT TTTTTTCTTG CCCTTTCCAT TAGTTGTATT TTTTATTTAT 2580 

TTTTATTTTT ATTTTTTTTT AGAGTTTGAG TCCAGCCTGG ACGATATAGC CAGACCCTGT 2640 
CTGTAAAAAA ACCAAAACCC AAAAAAAAAA AAAAAAAAAA

Seq ID NO: 425 Protein sequence 
Protein Accession #: AAH10423

1 11 21 31 41 51

| | | | | |

MPLSLGAEMW GPEAWLLLLL LLASFTGRCP AGELETSDVV TVVLGQDAKL PCFYRGDSGE 60

QVGQVAWARV DAGEGAQELA LLHSKYGLHV SPAYEGRVEQ PPPPRNPLDG SVLLRNAVQA 120

DEGEYECRVS TFPAGSFQAR LRLRVLVPPL PSLNPGPALE EGQGLTLAAS CTAEGSPAPS 180

VTWDTEVKGT TSSRSFKHSR SAAVTSEFHL VPSRSMNGQP LTCVVSHPGL LQDQRITHIL 240

HVSFLAEASV RGLEDQNLWH IGREGAMLKC LSEGQPPPSY NWTRLDGPLP SGVRVDGDTL 300

GFPPLTTEHS GIYVCHVSNE FSSRDSQVTV DVLDPQEDSG KQVDLVSASV VVVGVIAALL 360

FCLLVVVVVL MSRYHRRKAQ QMTQKYEEEL TLTRENSIRR LHSHHTDPRS QPEESVGLRA 420

EGHPDSLKDN SSCSVMSEEP EGRSYSTLTT VREIETQTEL LSPGSGRAEE EEDQDEGIKQ 480 
AMNHFVQENG TLRAKPTGNG IYINGRGHLV 

Seq ID NO: 426 DNA sequence 

Nucleic Acid Accession #: NM_003474.2 

Coding sequence: 307..3036

1 11 21 31 41 51

| | | | | |

CACTAACGCT CTTCCTAGTC CCCGGGCCAA CTCGGACAGT TTGCTCATTT ATTGCAACGG 60 

TCAAGGCTGG CTTGTGCCAG AACGGCGCGC GCGCGACGCA CGCACACACA CGGGGGGAAA 120 

CTTTTTTAAA AATGAAAGGC TAGAAGAGCT CAGCGGCGGC GCGGGCCGTG CGCGAGGGCT 180 

CCGGAGCTGA CTCGCCGAGG CAGGAAATCC CTCCGGTCGC GACGCCCGGC CCCGCTCGGC 240

GCCCGCGTGG GATGGTGCAG CGCTCGCCGC CGGGCCCGAG AGCTGCTGCA CTGAAGGCCG 300 

GCGACGATGG CAGCGCGCCC GCTGCCCGTG TCCCCCGCCC GCGCCCTCCT GCTCGCCCTG 360 

GCCGGTGCTC TGCTCGCGCC CTGCGAGGCC CGAGGGGTGA GCTTATGGAA CGAAGGAAGA 420 

GCTGATGAAG TTGTCAGTGC CTCTGTTCGG AGTGGGGACC TCTGGATCCC AGTGAAGAGC 480 

TTCGACTCCA AGAATCATCC AGAAGTGCTG AATATTCGAC TACAACGGGA AAGCAAAGAA 540 

CTGATCATAA ATCTGGAAAG AAATGAAGGT CTCATTGCCA GCAGTTTCAC GGAAACCCAC 600 
TATCTGCAAG ACGGTACTGA TGTCTCCCTC GCTCGAAATT ACACGGTAAT TCTGGGTCAC 660

TGTTACTACC ATGGACATGT ACGGGGATAT TCTGATTCAG CAGTCAGTCT CAGCACGTGT 720 

TCTGGTCTCA GGGGACTTAT TGTGTTTGAA AATGAAAGCT ATGTCTTAGA ACCAATGAAA 780 

AGTGCAACCA ACAGATACAA ACTCTTCCCA GCGAAGAAGC TGAAAAGCGT CCGGGGATCA 840 

TGTGGATCAC ATCACAACAC ACCAAACCTC GCTGCAAAGA ATGTGTTTCC ACCACCCTCT 900 

CAGACATGGG CAAGAAGGCA TAAAAGAGAG ACCCTCAAGG CAACTAAGTA TGTGGAGCTG 960 

GTGATCGTGG CAGACAACCG AGAGTTTCAG AGGCAAGGAA AAGATCTGGA AAAAGTTAAG 1020 

CAGCGATTAA TAGAGATTGC TAATCACGTT GACAAGTTTT ACAGACCACT GAACATTCGG 1080 

ATCGTGTTGG TAGGCGTGGA AGTGTGGAAT GACATGGACA AATGCTCTGT AAGTCAGGAC 1140 

CCATTCACCA GCCTCCATGA ATTTCTGGAC TGGAGGAAGA TGAAGCTTCT ACCTCGCAAA 1200 

TCCCATGACA ATGCGCAGCT TGTCAGTGGG GTTTATTTCC AAGGGACCAC CATCGGCATG 1260 

GCCCCAATCA TGAGCATGTG CACGGCAGAC CAGTCTGGGG GAATTGTCAT GGACCATTCA 1320 

GACAATCCCC TTGGTGCAGC CGTGACCCTG GCACATGAGC TGGGCCACAA TTTCGGGATG 1380 

AATCATGACA CACTGGACAG GGGCTGTAGC TGTCAAATGG CGGTTGAGAA AGGAGGCTGC 1440 

ATCATGAACG CTTCCACCGG GTACCCATTT CCCATGGTGT TCAGCAGTTG CAGCAGGAAG 1500 

GACTTGGAGA CCAGCCTGGA GAAAGGAATG GGGGTGTGCC TGTTTAACCT GCCGGAAGTC 1560 

AGGGAGTCTT TCGGGGGCCA GAAGTGTGGG AACAGATTTG TGGAAGAAGG AGAGGAGTGT 1620 

GACTGTGGGG AGCCAGAGGA ATGTATGAAT CGCTGCTGCA ATGCCACCAC CTGTACCCTG 1680 

AAGCCGGACG CTGTGTGCGC ACATGGGCTG TGCTGTGAAG ACTGCCAGCT GAAGCCTGCA 1740 

GGAACAGCGT GCAGGGACTC CAGCAACTCC TGTGACCTCC CAGAGTTCTG CACAGGGGCC 1800 

AGCCCTCACT GCCCAGCCAA CGTGTACCTG CACGATGGGC ACTCATGTCA GGATGTGGAC 1860

GGCTACTGCT ACAATGGCAT CTGCCAGACT CACGAGCAGC AGTGTGTCAC ACTCTGGGGA 1920 

CCAGGTGCTA AACCTGCCCC TGGGATCTGC TTTGAGAGAG TCAATTCTGC AGGTGATCCT 1980 

TATGGCAACT GTGGCAAAGT CTCGAAGAGT TCCTTTGCCA AATGCGAGAT GAGAGATGCT 2040 

AAATGTGGAA AAATCCAGTG TCAAGGAGGT GCCAGCCGGC CAGTCATTGG TACCAATGCC 2100 

GTTTCCATAG AAACAAACAT CCCCCTGCAG CAAGGAGGCC GGATTCTGTG CCGGGGGACC 2160 

CACGTGTACT TGGGCGATGA CATGCCGGAC CCAGGGCTTG TGCTTGCAGG CACAAAGTGT 2220 

GCAGATGGAA AAATCTGCCT GAATCGTCAA TGTCAAAATA TTAGTGTCTT TGGGGTTCAC 2280 

GAGTGTGCAA TGCAGTGCCA CGGCAGAGGG GTGTGCAACA ACAGGAAGAA CTGCCACTGC 2340 

GAGGCCCACT GGGCACCTCC CTTCTGTGAC AAGTTTGGCT TTGGAGGAAG CACAGACAGC 2400 

GGCCCCATCC GGCAAGCAGA TAACCAAGGT TTAACCATAG GAATTCTGGT GACCATCCTG 2460 

TGTCTTCTTG CTGCCGGATT TGTGGTTTAT CTCAAAAGGA AGACCTTGAT ACGACTGCTG 2520 

TTTACAAATA AGAAGACCAC CATTGAAAAA CTAAGGTGTG TGCGCCCTTC CCGGCCACCC 2580 

CGTGGCTTCC AACCCTGTCA GGCTCACCTC GGCCACCTTG GAAAAGGCCT GATGAGGAAG 2640 

CCGCCAGATT CCTACCCACC GAAGGACAAT CCCAGGAGAT TGCTGCAGTG TCAGAATGTT 2700 

GACATCAGCA GACCCCTCAA CGGCCTGAAT GTCCCTCAGC CCCAGTCAAC TCAGCGAGTG 2760 

CTTCCTCCCC TCCACCGGGC CCCACGTGCA CCTAGCGTCC CTGCCAGACC CCTGCCAGCC 2820 

AAGCCTGCAC TTAGGCAGGC CCAGGGGACC TGTAAGCCAA ACCCCCCTCA GAAGCCTCTG 2880 

CCTGCAGATC CTCTGGCCAG AACAACTCGG CTCACTCATG CCTTGGCCAG GACCCCAGGA 2940 

CAATGGGAGA CTGGGCTCCG CCTGGCACCC CTCAGACCTG CTCCACAATA TCCACACCAA 3000 

GTGCCCAGAT CCACCCACAC CGCCTATATT AAGTGAGAAG CCGACACCTT TTTTCAACAG 3060 

TGAAGACAGA AGTTTGCACT ATCTTTCAGC TCCAGTTGGA GTTTTTTGTA CCAACTTTTA 3120 

GGATTTTTTT TAATGTTTAA AACATCATTA CTATAAGAAC TTTGAGCTAC TGCCGTCAGT 3180 

GCTGTGCTGT GCTATGGTGC TCTGTCTACT TGCACAGGTA CTTGTAAATT ATTAATTTAT 3240 

GCAGAATGTT GATTACAGTG CAGTGCGCTG TAGTAGGCAT TTTTACCATC ACTGAGTTTT 3300 

CCATGGCAGG AAGGCTTGTT GTGCTTTTAG TATTTTAGTG AACTTGAAAT ATCCTGCTTG 3360 

ATGGGATTCT GGACAGGATG TGTTTGCTTT CTGATCAAGG CCTTATTGGA AAGCAGTCCC 3420 

CCAACTACCC CCAGCTGTGC TTATGGTACC AGATGCAGCT CAAGAGATCC CAAGTAGAAT 3480 

CTCAGTTGAT TTTCTGGATT CCCCATCTCA GGCCAGAGCC AAGGGGCTTC AGGTCCAGGC 3540 

TGTGTTTGGC TTTCAGGGAG GCCCTGTGCC CCTTGACAAC TGGCAGGCAG GCTCCCAGGG 3600 

ACACCTGGGA GAAATCTGGC TTCTGGCCAG GAAGCTTTGG TGAGAACCTG GGTTGCAGAC 3660 

AGGAATCTTA AGGTGTAGCC ACACCAGGAT AGAGACTGGA ACACTAGACA AGCCAGAACT 3720 

TGACCCTGAG CTGACCAGCC GTGAGCATGT TTGGAAGGGG TCTGTAGTGT CACTCAAGGC 3780

GGTGCTTGAT AGAAATGCCA AGCACTTCTT TTTCTCGCTG TCCTTTCTAG AGCACTGCCA 3840 
CCAGTAGGTT ATTTAGCTTG GGAAAGGTGG TGTTTCTGTA AGAAACCTAC TGCCCAGGCA 3900

CTGCAAACCC CCACCTCCCT ATACTGCTTG GAGCTGAGCA AATCACCACA AACTGTAATA 3960

CAATGATCCT GTATTCAGAC AGATGAGGAC TTTCCATGGG ACCACAACTA TTTTCAGATG 4020 

TGAACCATTA ACCAGATCTA GTCAATCAAG TCTGTTTACT GCAAGGTTCA ACTTATTAAC 4080 

AATTAGGCAG ACTCTTTATG CTTGCAAAAA CTACAACCAA TGGAATGTGA TGTTCATGGG 4140 

TATAGTTCAT GTCTGCTATC ATTATTCGTA GATATTGGAC AAAGAACCTT CTCTATGGGG 4200

CATCCTCTTT TTCCAACTTG GCTGCAGGAA TCTTTAAAAG ATGCTTTTAA CAGAGTCTGA 4260 

ACCTATTTCT TAAACACTTG CAACCTACCT GTTGAGCATC ACAGAATGTG ATAAGGAAAT 4320 

CAACTTGCTT ATCAACTTCC TAAATATTAT GAGATGTGGC TTGGGCAGCA TCCCCTTGAA 4380 

CTCTTCACTC TTCAAATGCC TGACTAGGGA GCCATGTTTC ACAAGGTCTT TAAAGTGACT 4440 

AATGGCATGA GAAATACAAA AATACTCAGA TAAGGTAAAA TGCCATGATG CCTCTGTCTT 4500

CTGGACTGGT TTTCACATTA GAAGACAATT GACAACAGTT ACATAATTCA CTCTGAGTGT 4560

TTTATGAGAA AGCCTTCTTT TGGGGTCAAC AGTTTTCCTA TGCTTTGAAA CAGAAAAATA 4620 

TGTACCAAGA ATCTTGGTTT GCCTTCCAGA AAACAAAACT GCATTTCACT TTCCCGGTGT 4680 

TCCCCACTGT ATCTAGGCAA CATAGTATTC ATGACTATGG ATAAACTAAA CACGTGACAC 4740 

AAACACACAC AAAAGGGAAC CCAGCTCTAA TACATTCCAA CTCGTATAGC ATGCATCTGT 4800 

TTATTCTATA GTTATTAAGT TCTTTAAAAT GTAAAGCCAT GCTGGAAAAT AATACTGCTG 4860 

AGATACATAC AGAATTACTG TAACTGATTA CACTTGGTAA TTGTACTAAA GCCAAACATA 4920

TATATACTAT TAAAAAGGTT TACAGAATTT TATGGTGCAT TACGTGGGCA TTGTCTTTTT 4980 

AGATGCCCAA ATCCTTAGAT CTGGCATGTT AGCCCTTCCT CCAATTATAA GAGGATATGA 5040 
ACCAAAAAAA AAAAAAAAAA AA 

Seq ID NOs 427 Protein sequence 
Protein Accession ft: NP_003465 

1 11 21 31 41 51 

| | | | | |

MAARPLPVSP ARALLLALAG ALLAPCEARG VSLWNEGRAD EVVSASVRSG DLWIPVKSFD 60

SKNHPEVLNI RLQRESKELI INLERNEGLI ASSFTETHYL QDGTDVSLAR NYTVILGHCY 120

YHGHVRGYSD SAVSLSTCSG LRGLIVFENE SYVLEPMKSA TNRYKLFPAK KLKSVRGSCG 180 

SHHNTPNLAA KNVFPPPSQT WARRHKRETL KATKYVELVI VADNREFQRQ GKDLEKVKQR 240 

LIEIANHVDK FYRPLNIRIV LVGVEVWNDM DKCSVSQDPF TSLHEFLDWR KMKLLPRKSH 300

DNAQLVSGVY FQGTTIGMAP IMSMCTADQS GGIVMDHSDN PLGAAVTLAH ELGHNFGMNH 360 

DTLDRGCSCQ MAVEKGGCIM NASTGYPFPM VFSSCSRKDL ETSLEKGMGV CLFNLPEVRE 420 

SFGGQKCGNR FVEEGEECDC GEPEECMNRC CNATTCTLKP DAVCAHGLCC EDCQLKPAGT 480 

ACRDSSNSCD LPEFCTGASP HCPANVYLHD GHSCQDVDGY CYNGICQTHE QQCVTLWGPG 540

AKPAPGICFE RVNSAGDPYG NCGKVSKSSF AKCEMRDAKC GKIQCQGGAS RPVIGTNAVS 600 

IETNIPLQQG GRILCRGTHV YLGDDMPDPG LVLAGTKCAD GKICLNRQCQ NISVFGVHEC 660

AMQCHGRGVC NNRKNCHCEA HWAPPFCDKF GFGGSTDSGP IRQADNQGLT IGILVTILCL 720 

LAAGFVVYLK RKTLIRLLFT NKKTTIEKLR CVRPSRPPRG FQPCQAHLGH LGKGLMRKPP 780

DSYPPKDNPR RLLQCQNVDI SRPLNGLNVP QPQSTQRVLP PLHRAPRAPS VPARPLPAKP 840

ALRQAQGTCK PNPPQKPLPA DPIARTTRLT HALARTPGQW ETGLRLAPLR PAPQYPHQVP 900 
RSTHTAYIK 

Seq ID NO: 428 DNA sequence 

Nucleic Acid Accession #: NM_003714

Coding sequence: 135..1043

1 11 21 31 41 51

| | | | | |

GAGGAGGAGG GAAAAGGCGA GCAAAAAGGA AGAGTGGGAG GAGGAGGGGA AGCGGCGAAG 60 

GAGGAAGAGG AGGAGGAGGA AGAGGGGAGC ACAAAGGATC CAGGTCTCCC GACGGGAGGT 120 

TAATACCAAG AACCATGTGT GCCGAGCGGC TGGGCCAGTT CATGACCCTG GCTTTGGTGT 180 

TGGCCACCTT TGACCCGGCG CGGGGGACCG ACGCCACCAA CCCACCCGAG GGTCCCCAAG 240 

ACAGGAGCTC CCAGCAGAAA GGCCGCCTGT CCCTGCAGAA TACAGCGGAG ATCCAGCACT 300 

GTTTGGTCAA CGCTGGCGAT GTGGGGTGTG GCGTGTTTGA ATGTTTCGAG AACAACTCTT 360 

GTGAGATTCG GGGCTTACAT GGGATTTGCA TGACTTTTCT GCACAACGCT GGAAAATTTG 420 

ATGCCCAGGG CAAGTCATTC ATCAAAGACG CCTTGAAATG TAAGGCCCAC GCTCTGCGGC 480 

ACAGGTTCGG CTGCATAAGC CGGAAGTGCC CGGCCATCAG GGAAATGGTG TCCCAGTTGC 540 

AGCGGGAATG CTACCTCAAG CACGACCTGT GCGCGGCTGC CCAGGAGAAC ACCCGGGTGA 600 

TAGTGGAGAT GATCCATTTC AAGGACTTGC TGCTGCACGA ACCCTACGTG GACCTCGTGA 660 

ACTTGCTGCT GACCTGTGGG GAGGAGGTGA AGGAGGCCAT CACCCACAGC GTGCAGGTTC 720 

AGTGTGAGCA GAACTGGGGA AGCCTGTGCT CCATCTTGAG CTTCTGCACC TCGGCCATCC 780 

AGAAGCCTCC CACGGCGCCC CCCGAGCGCC AGCCCCAGGT GGACAGAACC AAGCTCTCCA 840 

GGGCCCACCA CGGGGAAGCA GGACATCACC TCCCAGAGCC CAGCAGTAGG GAGACTGGCC 900 

GAGGTGCCAA GGGTGAGCGA GGTAGCAAGA GCCACCCAAA CGCCCATGCC CGAGGCAGAG 960 

TCGGGGGCCT TGGGGCTCAG GGACCTTCCG GAAGCAGCGA GTGGGAAGAC GAACAGTCTG 1020 

AGTATTCTGA TATCCGGAGG TGAAATGAAA GGCCTGGCCA CGAAATCTTT CCTCCACGCC 1080 

GTCCATTTTC TTATCTATGG ACATTCCAAA ACATTTACCA TTAGAGAGGG GGGATGTCAC 1140 

ACGCAGGATT CTGTGGGGAC TGTGGACTTC ATCGAGGTGT GTGTTCGCGG AACGGACAGG 1200 

TGAGATGGAG ACCCCTGGGG CCGTGGGGTC TCAGGGGTGC CTGGTGAATT CTGCACTTAC 1260 

ACGTACTCAA GGGAGCGCGC CCGCGTTATC CTCGTACCTT TGTCTTCTTT CCATCTGTGG 1320 

AGTCAGTGGG TGTCGGCCGC TCTGTTGTGG GGGAGGTGAA CCAGGGAGGG GCAGGGCAAG 1380 

GCAGGGCCCC CAGAGCTGGG CCACACAGTG GGTGCTGGGC CTCGCCCCGA AGCTTCTGGT 1440 

GCAGCAGCCT CTGGTGCTGT CTCCGCGGAA GTCAGGGCGG CTGGATTCCA GGACAGGAGT 1500 

GAATGTAAAA ATAAATATCG CTTAGAATGC AGGAGAAGGG TGGAGAGGAG GCAGGGGCCG 1560 

AGGGGGTGCT TGGTGCCAAA CTGAAATTCA GTTTCTTGTG TGGGGCCTTG CGGTTCAGAG 1620 

CTCTTGGCGA GGGTGGAGGG AGGAGTGTCA TTTCTATGTG TAATTTCTGA GCCATTGTAC 1680

TGTCTGGGCT GGGGGGGACA CTGTCCAAGG GAGTGGCCCC TATGAGTTTA TATTTTAACC 1740 

ACTGCTTCAA ATCTCGATTT CACTTTTTTT ATTTATCCAG TTATATCTAC ATATCTGTCA 1800 

TCTAAATAAA TGGCTTTCAA ACAAAGCAAC TGGGTCATTA AAACCAGCTC AAAGGGGGTT 1860 

TAAAAAAAAA AAAACCAGCC CATCCTTTGA GGCTGATTTT TCTTTTTTTT AAGTTCTATT 1920 

TTAAAAGCTA TCAAACAGCG ACATAGCCAT ACATCTGACT GCCTGACATG GACTCCTGCC 1980 

CACTTGGGGG AAACCTTATA CCCAGAGGAA AATACACACC TGGGGAGTAC ATTTGACAAA 2040 

TTTCCCTTAG GATTTCGTTA TCTCACCTTG ACCCTCAGCC AAGATTGGTA AAGCTGOGTC 2100 

CTGGCGATTC CAGGAGACCC AGCTGGAAAC CTGGCTTCTC CATGTGAGGG GATGGGAAAG 2160 

GAAAGAAGAG AATGAAGACT ACTTAGTAAT TCCCATCAGG AAATGCTGAC CTTTTACATA 2220 
AAATCAAGGA GACTGCTGAA AATCTCTAAG GGACAGGATT TTCCAGATCC TAATTGGAAA 2280

TTTAGCAATA AGGAGAGGAG TCCAAGGGGA CAAATAAAGG CAGAGAGAGA GAGAGAGAGA 2340 
GGGAGAGGAA GAAAAGAGAG AGAGAAAAGA GCCTCGTGCC 

Seq ID NO: 429 Protein sequence
Protein Accession #: NP_003705 

1 11 21 31 41 51

I I I I I I 

MCAERLGQFM TLALVLATFD PARGTDATNP PEGPQDRSSQ QKGRLSLQNT AEIQHCLVNA 60 

GDVGCGVFEC FENNSCEIRG LHGICMTFLH NAGKFDAQGK SFIKDALKCK AHALRHRFGC 120

ISRKCPAIRE MVSQLQRECY LKHDLCAAAQ ENTRVIVEMI HFKDLLLHEP YVDLVNLLLT 180 

CGEEVKEAIT HSVQVQCEQN WGSLCSILSF CTSAIQKPPT APPERQPQVD RTKLSRAHHG 240 

EAGHHLPEPS SRETGRGAKG ERGSKSHPNA HARGRVGGLG AQGPSGSSEW EDEQSEYSDI 300 
RR 

Seq ID NO: 430 DNA sequence 

Nucleic Acid Accession #: NM_005940

Coding sequence: 23..1489

1 11 21 31 41 51 

I I I I I I 

AAGCCCAGCA GCCCCGGGGC GGATGGCTCC GGCCGCCTGG CTCCGCAGCG CGGCCGCGCG 60 

CGCCCTCCTG CCCCCGATGC TGCTGCTGCT GCTCCAGCCG CCGCCGCTGC TGGCCCGGGC 120 

TCTGCCGCCG GACGTCCACC ACCTCCATGC CGAGAGGAGG GGGCCACAGC CCTGGCATGC 180 

AGCCCTGCCC AGTAGCCCGG CACCTGCCCC TGCCACGCAG GAAGCCCCCC GGCCTGCCAG 240 

CAGCCTCAGG CCTCCCCGCT GTGGCGTGCC CGACCCATCT GATGGGCTGA GTGCCCGCAA 300 

CCGACAGAAG AGGTTCGTGC TTTCTGGCGG GCGCTGGGAG AAGACGGACC TCACCTACAG 360 

GATCCTTCGG TTCCCATGGC AGTTGGTGCA GGAGCAGGTG CGGCAGACGA TGGCAGAGGC 420 

CCTAAAGGTA TGGAGCGATG TGACGCCACT CACCTTTACT GAGGTGCACG AGGGCCGTGC 480 

TGACATCATG ATCGACTTCG CCAGGTACTG GCATGGGGAC GACCTGCCGT TTGATGGGCC 540 

TGGGGGCATC CTGGCCCATG CCTTCTTCCC CAAGACTCAC CGAGAAGGGG ATGTCCACTT 600 

CGACTATGAT GAGACCTGGA CTATCGGGGA TGACCAGGGC ACAGACCTGC TGCAGGTGGC 660 

AGCCCATGAA TTTGGCCACG TGCTGGGGCT GCAGCACACA ACAGCAGCCA AGGCCCTGAT 720 

GTCCGCCTTC TACACCTTTC GCTACCCACT GAGTCTCAGC CCAGATGACT GCAGGGGCGT 780 

TCAACACCTA TATGGCCAGC CCTGGCCCAC TGTCACCTCC AGGACCCCAG CCCTGGGCCC 840 

CCAGGCTGGG ATAGACACCA ATGAGATTGC ACCGCTGGAG CCAGACGCCC CGCCAGATGC 900 

CTGTGAGGCC TCCTTTGACG CGGTCTCCAC CATCCGAGGC GAGCTCTTTT TCTTCAAAGC 960 

GGGCTTTGTG TGGCGCCTCC GTGGGGGCCA GCTGCAGCCC GGCTACCCAG CATTGGCCTC 1020 

TCGCCACTGG CAGGGACTGC CCAGCCCTGT GGACGCTGCC TTCGAGGATG CCCAGGGCCA 1080

CATTTGGTTC TTCCAAGGTG CTCAGTACTG GGTGTACGAC GGTGAAAAGC CAGTCCTGGG 1140 

CCCCGCACCC CTCACCGAGC TGGGCCTGGT GAGGTTCCCG GTCCATGCTG CCTTGGTCTG 1200 

GGGTCCCGAG AAGAACAAGA TCTACTTCTT CCGAGGCAGG GACTACTGGC GTTTCCACCC 1260 

CAGCACCCGG CGTGTAGACA GTCCCGTGCC CCGCAGGGCC ACTGACTGGA GAGGGGTGCC 1320 

CTCTGAGATC GACGCTGCCT TCCAGGATGC TGATGGCTAT GCCTACTTCC TGCGCGGCCG 1380 

CCTCTACTGG AAGTTTGACC CTGTGAAGGT GAAGGCTCTG GAAGGCTTCC CCCGTCTCGT 1440 

GGGTCCTGAC TTCTTTGGCT GTGCCGAGCC TGCCAACACT TTCCTCTGAC CATGGCTTGG 1500 

ATGCCCTCAG GGGTGCTGAC CCCTGCCAGG CCACGAATAT CAGGCTAGAG ACCCATGGCC 1560

ATCTTTGTGG CTGTGGGCAC CAGGCATGGG ACTGAGCCCA TGTCTCCTGC AGGGGGATGG 1620 

GGTGGGGTAC AACCACCATG ACAACTGCCG GGAGGGCCAC GCAGGTCGTG GTCACCTGCC 1680 

AGCGACTGTC TCAGACTGGG CAGGGAGGCT TTGGCATGAC TTAAGAGGAA GGGCAGTCTT 1740 

GGGACCCGCT ATGCAGGTCC TGGCAAACCT GGCTGCCCTG TCTCATCCCT GTCCCTCAGG 1800 

GTAGCACCAT GGCAGGACTG GGGGAACTGG AGTGTCCTTG CTGTATCCCT GTTGTGAGGT 1860 

TCCTTCCAGG GGCTGGCACT GAAGCAAGGG TGCTGGGGCC CCATGGCCTT CAGCCCTGGC 1920 

TGAGCAACTG GGCTGTAGGG CAGGGCCACT TCCTGAGGTC AGGTCTTGGT AGGTGCCTGC 1980 

ATCTGTCTGC CTTCTGGCTG ACAATCCTGG AAATCTGTTC TCCAGAATCC AGGCCAAAAA 2040 

GTTCACAGTC AAATGGGGAG GGGTATTCTT CATGCAGGAG ACCCCAGGCC CTGGAGGCTG 2100 

CAACATACCT CAATCCTGTC CCAGGCCGGA TCCTCCTGAA GCCCTTTTCG CAGCACTGCT 2160 

ATCCTCCAAA GCCATTGTAA ATGTGTGTAC AGTGTGTATA AACCTTCTTC TTCTTTTTTT 2220 
TTTTTAAACT GAGGATTGTC ATTAAACACA GTTGTTTTCT 

Seq ID NO: 431 Protein sequence 
Protein Accession #: NP_005931 

1 11 21 31 41 51 

I I I I I I 

MAPAAWLRSA AARALLPPML LLLLQPPPLL ARALPPDVHH LHAERRGPQP WHAALPSSPA 60 

PAPATQEAPR PASSLRPPRC GVPDPSDGLS ARNRQKRFVL SGGRWEKTDL TYRILRFPWQ 120

LVQEQVRQTM AEALKVWSDV TPLTFTEVHE GRADIMIDFA RYWHGDDLPF DGPGGILAHA 180

FFPKTHREGD VHFDYDETWT IGDDQGTDLL QVAAHEFGHV LGLQHTTAAK ALMSAFYTFR 240 

YPLSLSPDDC RGVQHLYGQP WPTVTSRTPA LGPQAGIDTN EIAPLEPDAP PDACEASFDA 300 

VSTIRGELFF FKAGFVWRLR GGQLQPGYPA LASRHWQGLP SPVDAAFEDA QGHIWFFQGA 360

QYWVYDGEKP VLGPAPLTEL GLVRFPVHAA LVWGPEKNKI YFFRGRDYWR FHPSTRRVDS 420

PVPRRATDWR GVPSEIDAAF QDADGYAYFL RGRLYWKFDP VKVKALEGFP RLVGPDFFGC 480 
AEPANTFL 

Seq ID NO: 432 DNA sequence 

Nucleic Acid Accession #: NM_024022

Coding sequence: 202..1563

1 11 21 31 41 51 

I I I I I I 

ACCGGGCACC GGACGGCTCG GGTACTTTCG TTCTTAATTA GGTCATGCCC GTGTGAGCCA 60 

GGAAAGGGCT GTGTTTATGG GAAGCCAGTA ACACTGTGGC CTACTATCTC TTCCGTGGTG 120 

CCATCTACAT TTTTGGGACT CGGGAATTAT GAGGTAGAGG TGGAGGCGGA GCCGGATGTC 180 

AGAGGTCCTG AAATAGTCAC CATGGGGGAA AATGATCCGC CTGCTGTTGA AGCCCCCTTC 240 

TCATTCCGAT CGCTTTTTGG CCTTGATGAT TTGAAAATAA GTCCTGTTGC ACCAGATGCA 300 
GATGCTGTTG CTGCACAGAT CCTGTCACTG CTGCCATTGA AGTTTTTTCC AATCATCGTC 360

ATTGGGATCA TTGCATTGAT ATTAGCACTG GCCATTGGTC TGGGCATCCA CTTCGACTGC 420 

TCAGGGAAGT ACAGATGTCG CTCATCCTTT AAGTGTATCG AGCTGATAGC TCGATGTGAC 480 

GGAGTCTCGG ATTGCAAAGA CGGGGAGGAC GAGTACCGCT GTGTCCGGGT GGGTGGTCAG 540 

AATGCCGTGC TCCAGGTGTT CACAGCTGCT TCGTGGAAGA CCATGTGCTC CGATGACTGG 600 

AAGGGTCACT ACGCAAATGT TGCCTGTGCC CAACTGGGTT TCCCAAGCTA TGTGAGTTCA 660

GATAACCTCA GAGTGAGCTC GCTGGAGGGG CAGTTCCGGG AGGAGTTTGT GTCCATCGAT 720 

CACCTCTTGC CAGATGACAA GGTGACTGCA TTACACCACT CAGTATATGT GAGGGAGGGA 780 

TGTGCCTCTG GCCACGTGGT TACCTTGCAG TGCACAGCCT GTGGTCATAG AAGGGGCTAC 840 

AGCTCACGCA TCGTGGGTGG AAACATGTCC TTGCTCTCGC AGTGGCCCTG GCAGGCCAGC 900

CTTCAGTTCC AGGGCTACCA CCTGTGCGGG GGCTCTGTCA TCACGCCCCT GTGGATCATC 960 

ACTGCTGCAC ACTGTGTTTA TGACTTGTAC CTCCCCAAGT CATGGACCAT CCAGGTGGGT 1020 

CTAGTTTCCC TGTTGGACAA TCCAGCCCCA TCCCACTTGG TGGAGAAGAT TGTCTACCAC 1080 

AGCAAGTACA AGCCAAAGAG GCTGGGCAAT GACATCGCCC TTATGAAGCT GGCCGGGCCA 1140 

CTCACGTTCA ATGAAATGAT CCAGCCTGTG TGCCTGCCCA ACTCTGAAGA GAACTTCCCC 1200 

GATGGAAAAG TGTGCTGGAC GTCAGGATGG GGGGCCACAG AGGATGGAGG TGACGCCTCC 1260 

CCTGTCCTGA ACCACGCGGC CGTCCCTTTG ATTTCCAACA AGATCTGCAA CCACAGGGAC 1320

GTGTACGGTG GCATCATCTC CCCCTCCATG CTCTGCGCGG GCTACCTGAC GGGTGGCGTG 1380 

GACAGCTGCC AGGGGGACAG CGGGGGGCCC CTGGTGTGTC AAGAGAGGAG GCTGTGGAAG 1440 

TTAGTGGGAG CGACCAGCTT TGGCATCGGC TGCGCAGAGG TGAACAAGCC TGGGGTGTAC 1500 

ACCCGTGTCA CCTCCTTCCT GGACTGGATC CACGAGCAGA TGGAGAGAGA CCTAAAAACC 1560 

TGAAGAGGAA GGGGACAAGT AGCCACCTGA GTTCCTGAGG TGATGAAGAC AGCCCGATCC 1620 

TCCCCTGGAC TCCCGTGTAG GAACCTGCAC ACGAGCAGAC ACCCTTGGAG CTCTGAGTTC 1680 

CGGCACCAGT AGCAGGCCCG AAAGAGGCAC CCTTCCATCT GATTCCAGCA CAACCTTCAA 1740 

GCTGCTTTTT GTTTTTTGTT TTTTTGAGGT GGAGTCTCGC TCTGTTGCCC AGGCTGGAGT 1800 

GCAGTGGCGA AATCCCTGCT CACTGCAGCC TCCGCTTCCC TGGTTCAAGC GATTCTCTTG 1860 

CCTCAGCTTC CCCAGTAGCT GGGACCACAG GTGCCCGCCA CCACACCCAA CTAATTTTTG 1920 

TATTTTTAGT AGAGACAGGG TTTCACCATG TTGGCCAGGC TGCTCTCAAA CCCCTGACCT 1980 

CAAATGATGT GCCTGCTTCA GCCTCCCACA GTGCTGGGAT TACAGGCATG GGCCACCACG 2040 

CCTAGCCTCA CGCTCCTTTC TGATCTTCAC TAAGAACAAA AGAAGCAGCA ACTTGCAAGG 2100 

GCGGCCTTTC CCACTGGTCC ATCTGGTTTT CTCTCCAGGG GTCTTGCAAA ATTCCTGACG 2160 

AGATAAGCAG TTATGTGACC TCACGTGCAA AGCCACCAAC AGCCACTCAG AAAAGACGCA 2220 

CCAGCCCAGA AGTGCAGAAC TGCAGTCACT GCACGTTTTC ATCTCTAGGG ACCAGAACCA 2280 

AACCCACCCT TTCTACTTCC AAGACTTATT TTCACATGTG GGGAGGTTAA TCTAGGAATG 2340 

ACTCGTTTAA GGCCTATTTT CATGATTTCT TTGTAGCATT TGGTGCTTGA CGTATTATTG 2400 

TCCTTTGATT CCAAATAATA TGTTTCCTTC CCTCAAAAAA AAAAAAAAAA AAAAAAAAAA 2460 
AAAAA 

Seq ID NO: 433 Protein sequence 
Protein Accession #: NP_076927 

1 11 21 31 41 51 

I I I I I I 

MGENDPPAVE APFSFRSLFG LDDLKISPVA PDADAVAAQI LSLLPLKFFP IIVIGIIALI 60

LALAIGLGIH FDCSGKYRCR SSFKCIELIA RCDGVSDCKD GEDEYRCVRV GGQNAVLQVF 120 

TAASWKTMCS DDWKGHYANV ACAQLGFPSY VSSDNLRVSS LEGQFREEFV SIDHLLPDDK 180 

VTALHHSVYV REGCASGHVV TLQCTACGHR RGYSSRIVGG NMSLLSQWPW QASLQFQGYH 240

LCGGSVITPL WIITAAHCVY DLYLPKSWTI QVGLVSLLDN PAPSHLVEKI VYHSKYKPKR 300 

LGNDIALMKL AGPLTFNEMI QPVCLPNSEE NFPDGKVCWT SGWGATEDGG DASPVLNHAA 360

VPLISNKICN HRDVYGGIIS PSMLCAGYLT GGVDSCQGDS GGPLVCQERR LWKLVGATSF 420 
GIGCAEVNKP GVYTRVTSFL DWIHEQMERD LKT 

Seq ID NO: 434 DNA sequence 

Nucleic Acid Accession #: NM_000493.2

Coding sequence: 97..2139

1 11 21 31 41 51 

I I I I I I 

CACCTTCTGC ACTGCTCATC TGGGCAGAGG AAGCTTCAGA AAGCTGCCAA GGCACCATCT 60 

CCAGGAACTC CCAGCACGCA GAATCCATCT GAGAATATGC TGCCACAAAT ACCCTTTTTG 120 

CTGCTAGTAT CCTTGAACTT GGTTCATGGA GTGTTTTACG CTGAACGATA CCAAATGCCC 180 

ACAGGCATAA AAGGCCCACT ACCCAACACC AAGACACAGT TCTTCATTCC CTACACCATA 240 

AAGAGTAAAG GTATAGCAGT AAGAGGAGAG CAAGGTACTC CTGGTCCACC AGGCCCTGCT 300 

GGACCTCGAG GGCACCCAGG TCCTTCTGGA CCACCAGGAA AACCAGGCTA CGGAAGTCCT 360 

GGACTCCAAG GAGAGCCAGG GTTGCCAGGA CCACCGGGAC CATCAGCTGT AGGGAAACCA 420 

GGTGTGCCAG GACTCCCAGG AAAACCAGGA GAGAGAGGAC CATATGGACC AAAAGGAGAT 480 

GTTGGACCAG CTGGCCTACC AGGACCCCGG GGCCCACCAG GACCACCTGG AATCCCTGGA 540 

CCGGCTGGAA TTTCTGTGCC AGGAAAACCT GGACAACAGG GACCCACAGG AGCCCCAGGA 600 

CCCAGGGGCT TTCCTGGAGA AAAGGGTGCA CCAGGAGTCC CTGGTATGAA TGGACAGAAA 660 

GGGGAAATGG GATATGGTGC TCCTGGTCGT CCAGGTGAGA GGGGTCTTCC AGGCCCTCAG 720 

GGTCCCACAG GACCATCTGG CCCTCCTGGA GTGGGAAAAA GAGGTGAAAA TGGGGTTCCA 780 

GGACAGCCAG GCATCAAAGG TGATAGAGGT TTTCCGGGAG AAATGGGACC AATTGGCCCA 840 

CCAGGTCCCC AAGGCCCTCC TGGGGAACGA GGGCCAGAAG GCATTGGAAA GCCAGGAGCT 900 

GCTGGAGCCC CAGGCCAGCC AGGGATTCCA GGAACAAAAG GTCTCCCTGG GGCTCCAGGA 960 

ATAGCTGGGC CCCCAGGGCC TCCTGGCTTT GGGAAACCAG GCTTGCCAGG CCTGAAGGGA 1020

GAAAGAGGAC CTGCTGGCCT TCCTGGGGGT CCAGGTGCCA AAGGGGAACA AGGGCCAGCA 1080 

GGTCTTCCTG GGAAGCCAGG TCTGACTGGA CCCCCTGGGA ATATGGGACC CCAAGGACCA 1140 

AAAGGCATCC CGGGTAGCCA TGGTCTCCCA GGCCCTAAAG GTGAGACAGG GCCAGCTGGG 1200 

CCTGCAGGAT ACCCTGGGGC TAAGGGTGAA AGGGGTTCCC CTGGGTCAGA TGGAAAACCA 1260 

GGGTACCCAG GAAAACCAGG TCTCGATGGT CCTAAGGGTA ACCCAGGGTT ACCAGGTCCA 1320 

AAAGGTGATC CTGGAGTTGG AGGACCTCCT GGTCTCCCAG GCCCTGTGGG CCCAGCAGGA 1380 

GCAAAGGGAA TGCCCGGACA CAATGGAGAG GCTGGCCCAA GAGGTGCCCC TGGAATACCA 1440 

GGTACTAGAG GCCCTATTGG GCCACCAGGC ATTCCAGGAT TCCCTGGGTC TAAAGGGGAT 1500 

CCAGGAAGTC CCGGTCCTCC TGGCCCAGCT GGCATAGCAA CTAAGGGCCT CAATGGACCC 1560 

ACCGGGCCAC CAGGGCCTCC AGGTCCAAGA GGCCACTCTG GAGAGCCTGG TCTTCCAGGG 1620 

CCCCCTGGGC CTCCAGGCCC ACCAGGTCAA GCAGTCATGC CTGAGGGTTT TATAAAGGCA 1680 

GGCCAAAGGC CCAGTCTTTC TGGGACCCCT CTTGTTAGTG CCAACCAGGG GGTAACAGGA 1740 
ATGCCTGTGT CTGCTTTTAC TGTTATTCTC TCCAAAGCTT ACCCAGCAAT AGGAACTCCC 1800

ATACCATTTG ATAAAATTTT GTATAACAGG CAACAGCATT ATGACCCAAG GACTGGAATC 1860 

TTTACTTGTC AGATACCAGG AATATACTAT TTTTCATACC ACGTGCATGT GAAAGGGACT 1920

CATGTTTGGG TAGGCCTGTA TAAGAATGGC ACCCCTGTAA TGTACACCTA TGATGAATAC 1980 

ACCAAAGGCT ACCTGGATCA GGCTTCAGGG AGTGCCATCA TCGATCTCAC AGAAAATGAC 2040 

CAGGTGTGGC TCCAGCTTCC CAATGCCGAG TCAAATGGCC TATACTCCTC TGAGTATGTC 2100 

CACTCCTCTT TCTCAGGATT CCTAGTGGCT CCAATGTGAG TACACCCCAC AGAGCTAATC 2160 

TAAATCTTGT GCTAGAAAAA GCATTCTCTA ACTCTACCCC ACCCTACAAA ATGCATATGG 2220 

AGGTAGGCTG AAAAGAATGT AATTTTTATT TTCTGAAATA CAGATTTGAG CTATCAGACC 2280 

AACAAACCTT CCCCCTGAAA AGTGAGCAGC AACGTAAAAA CGTATGTGAA GCCTCTCTTG 2340 

AATTTCTAGT TAGCAATCTT AAGGCTCTTT AAGGTTTTCT CCAATATTAA AAAATATCAC 2400 

CAAAGAAGTC CTGCTATGTT AAAAACAAAC AACAAAAAAC AAAGCAACAA AAAAAAAAAT 2460 

TAAAAAAAAA AACAGAAATA GAGCTCTAAG TTATGTGAAA TTTGATTTGA GAAACTCGGC 2520 

ATTTCCTTTT TAAAAAAGCC TGTTTCTAAC TATGAATATG AGAACTTCTA GGAAACATCC 2580 

AGGAGGTATC ATATAACTTT GTAGAACTTA AATACTTGAA TATTCAAATT TAAAAGACAC 2640 

TGTATCCCCT AAAATATTTC TGATGGTGCA CTACTCTGAG GCCTGTATGG CCCCTTTCAT 2700 

CAATATCTAT TCAAATATAC AGGTGCATAT ATACTTGTTA AAGCTCTTAT ATAAAAAAGC 2760

CCCAAAATAT TGAAGTTCAT CTGAAATGCA AGGTGCTTTC ATCAATGAAC CTTTTCAAAA 2820 

CTTTTCTATG ATTGCAGAGA AGCTTTTTAT ATACCCAGCA TAACTTGGAA ACAGGTATCT 2880

GACCTATTCT TATTTAGTTA ACACAAGTGT GATTAATTTG ATTTCTTTAA TTCCTTATTG 2940 

AATCTTATGT GATATGATTT TCTGGATTTA CAGAACATTA GCACATGTAC CTTGTGCCTC 3000 

CCATTCAAGT GAAGTTATAA TTTACACTGA GGGTTTCAAA ATTCGACTAG AAGTGGAGAT 3060

ATATTATTTA TTTATGCACT GTACTGTATT TTTATATTGC TGTTTAAAAC TTTTAAGCTG 3120 

TGCCTCACTT ATTAAAGCAC AAAATGTTTT ACCTACTCCT TATTTACGAC ACAATAAAAT 3180 

AACATCAATA GATTTTTAGG CTGAATTAAT TTGAAAGCAG CAATTTGCTG TTCTCAACCA 3240 
TTCTTTCAAG GCTTTTCATT CGACACAATA AAATAACATC AATAG 

Seq ID NO: 435 Protein sequence 
Protein Accession #: NP_000484.2

1 11 21 31 41 51

| | | I I I 

MLPQIPFLLL VSLNLVHGVF YAERYQMPTG IKGPLPNTKT QFFIPYTIKS KGIAVRGEQG 60

TPGPPGPAGP RGHPGPSGPP GKPGYGSPGL QGEPGLPGPP GPSAVGKPGV PGLPGKPGER 120

GPYGPKGDVG PAGLPGPRGP PGPPGIPGPA GISVPGKPGQ QGPTGAPGPR GFPGEKGAPG 180 

VPGMNGQKGE MGYGAPGRPG ERGLPGPQGP TGPSGPPGVG KRGENGVPGQ PGIKGDRGFP 240

GEMGPIGPPG PQGPPGERGP EGIGKPGAAG APGQPGIPGT KGLPGAPGIA GPPGPPGFGK 300 

PGLPGLKGER GPAGLPGGPG AKGEQGPAGL PGKPGLTGPP GNMGPQGPKG IPGSHGLPGP 360

KGETGPAGPA GYPGAKGERG SPGSDGKPGY PGKPGLDGPK GNPGLPGPKG DPGVGGPPGL 420

PGPVGPAGAK GMPGHNGEAG PRGAPGIPGT RGPIGPPGIP GFPGSKGDPG SPGPPGPAGI 480 

ATKGLNGPTG PPGPPGPRGH SGEPGLPGPP GPPGPPGQAV MPEGFIKAGQ RPSLSGTPLV 540 

SANQGVTGMP VSAFTVILSK AYPAIGTPIP FDKILYNRQQ HYDPRTGIFT CQIPGIYYFS 600 

YHVHVKGTHV WVGLYKNGTP VMYTYDEYTK GYLDQASGSA IIDLTENDQV WLQLPNAESN 660 
GLYSSEYVHS SFSGFLVAPM 

Seq ID NO: 436 DNA sequence 
Nucleic Acid Accession #: XM_062811
Coding sequence: 1..888 

1 11 21 31 41 51

| | | | | |

ATGTGGGGCG CTCGCCGCTC GTCCGTCTCC TCATCCTGGA ACGCCGCTTC GCTCCTGCAG 60 

CTGCTGCTGG CTGCGCTGCT GGCGGCGGGG GCGAGGGCCA GCGGCGAGTA CTGCCACGGC 120 

TGGCTGGACG CGCAGGGCGT CTGGCGCATC GGCTTCCAGT GTCCCGAGCG CTTCGACGGC 180 

GGCGACGCCA CCATCTGCTG CGGCAGCTGC GCGTTGCGCT ACTGCTGCTC CAGCGCCGAG 240 

GCGCGCCTGG ACCAGGGCGG CTGCGACAAT GACCGCCAGC AGGGCGCTGG CGAGCCTGGC 300 

CGGGCGGACA AAGACGGCCC CGACGGCTCG GCAGTGCCCA TCTACGTGCC GTTCCTCATT 360 

GTTGGCTCCG TGTTTGTCGC CTTTATCATC TTGGGGTCCC TGGTGGCAGC CTGTTGCTGC 420 

AGATGTCTCC GGCCTAAGCA GGATCCCCAG CAGAGCCGAG CCCCAGGGGG TAACCGCTTG 480 

ATGGAGACCA TCCCCATGAT CCCCAGTGCC AGCACCTCCC GGGGGTCGTC CTCACGCCAG 540 

TCCAGCACAG CTGCCAGTTC CAGCTCCAGC GCCAACTCAG GGGCCCGGGC GCCCCCAACA 600 

AGGTCACAGA CCAACTGTTG CTTGCCGGAA GGGACCATGA ACAACGTGTA TGTCAACATG 660 

CCCACGAATT TCTCTGTGCT GAACTGTCAG CAGGCCACCC AGATTGTGCC ACATCAAGGG 720 

CAGTATCTGC ATCCCCCATA CGTGGGGTAC ACGGTGCAGC ACGACTCTGT GCCCATGACA 780 

GCTGTGCCAC CTTTCATGGA CGGCCTGCAG CCTGGCTACA GGCAGATTCA GTCCCCCTTC 840 
CCTCACACCA ACAGTGAACA GAAGATGTAC CCAGCGGTGA CTGTATAA 

Seq ID NO: 437 Protein sequence 
Protein Accession #: XP_062811

1 11 21 31 41 51

I I I I I I 

MWGARRSSVS SSWNAASLLQ LLLAALLAAG ARASGEYCHG WLDAQGVWRI GFQCPERFDG 60

GDATICCGSC ALRYCCSSAE ARLDQGGCDN DRQQGAGEPG RADKDGPDGS AVPIYVPFLI 120 

VGSVFVAFII LGSLVAACCC RCLRPKQDPQ QSRAPGGNRL METIPMIPSA STSRGSSSRQ 180

SSTAASSSSS ANSGARAPPT RSQTNCCLPE GTMNNVYVNM PTNFSVLNCQ QATQIVPHQG 240 
QYLHPPYVGY TVQHDSVPMT AVPPFMDGLQ PGYRQIQSPF PHTNSEQKMY PAVTV

Seq ID NO: 438 DNA sequence 

Nucleic Acid Accession #: NM_004004.1

Coding sequence: 1..681

1 11 21 31 41 51 

| | I I I I 

ATGGATTGGG GCACGCTGCA GACGATCCTG GGGGGTGTGA ACAAACACTC CACCAGCATT 60 

GGAAAGATCT GGCTCACCGT CCTCTTCATT TTTCGCATTA TGATCCTCGT TGTGGCTGCA 120 

AAGGAGGTGT GGGGAGATGA GCAGGCCGAC TTTGTCTGCA ACACCCTGCA GCCAGGCTGC 180 
AAGAACGTGT GCTACGATCA CTACTTCCCC ATCTCCCACA TCCGGCTATG GGCCCTGCAG 240

CTGATCTTCG TGTCCAGCCC AGCGCTCCTA GTGGCCATGC ACGTGGCCTA CCGGAGACAT 300

GAGAAGAAGA GGAAGTTCAT CAAGGGGGAG ATAAAGAGTG AATTTAAGGA CATCGAGGAG 360 

ATCAAAACCC AGAAGGTCCG CATCGAAGGC TCCCTGTGGT GGACCTACAC AAGCAGCATC 420

TTCTTCCGGG TCATCTTCGA AGCCGCCTTC ATGTACGTCT TCTATGTCAT GTACGACGGC 480 

TTCTCCATGC AGCGGCTGGT GAAGTGCAAC GCCTGGCCTT GTCCCAACAC TGTGGACTGC 540 

TTTGTGTCCC GGCCCACGGA GAAGACTGTC TTCACAGTGT TCATGATTGC AGTGTCTGGA 600 

ATTTGCATCC TGCTGAATGT CACTGAATTG TGTTATTTGC TAATTAGATA TTGTTCTGGG 660 
AAGTCAAAAA AGCCAGTTTA A 

Seq ID NO: 439 Protein sequence 
Protein Accession #: NP_003995.1

1 11 21 31 41 51

I I I I I I 

MDWGTLQTIL GGVNKHSTSI GKIWLTVLFI FRIMILVVAA KEVWGDEQAD FVCNTLQPGC 60
KNVCYDHYFP ISHIRLWALQ LIFVSSPALL VAMHVAYRRH EKKRKFIKGE IKSEFKDIEE 120 
IKTQKVRIEG SLWWTYTSSI FFRVIFEAAF MYVFYVMYDG FSMQRLVKCN AWPCPNTVDC 180
FVSRPTEKTV FTVFMIAVSG ICILLNVTEL CYLLIRYCSG KSKKPV



Seq ID NO: 440 DNA sequence 

Nucleic Acid Accession #: XM_061091.1

Coding sequence: 1..2481 

1 11 21 31 41 51 

ATGCCAAATA CTTCAGGAAC AACCAGGATT GAAATTTGGC TTCTCCAAGA GCCGCCCGGG 60 

CACCGAGCGC TGGTCGCCGC TCTCCTTCCG GTGAGTCCCA GCCCCGAGTT GGCTCTGGCG 120 

CCCGGGTACC CGCCAGTGCC GGCTGCCGAT GACCGATTCA CGCTCCCGAT GATTGGAGGT 180 

CAGATGCATG GTGAGAAGGT AGATCTCTGG AGCCTTGGTG TTCTTTGCTA TGAATTTTTA 240 

GTTGGGAAGC CTCCTTTTGA GGCAAACGAA GTCCATGTAA GCAAAGAAAC CATCGGGAAG 300 

ATTTCAGCTG CCAGCAAAAT GATGTGGTGC TCGGCTGCAG TGGACATCAT GTTTCTGTTA 360 

GATGGGTCTA ACAGCGTCGG GAAAGGGAGC TTTGAAAGGT CCAAGCACTT TGCCATCACA 420 

GTCTGTGACG GTCTGGACAT CAGCCCCGAG AGGGTCAGAG TGGGAGCATT CCAGTTCAGT 480 

TCCACTCCTC ATCTGGAATT CCCCTTGGAT TCATTTTCAA CCCAACAGGA AGTGAAGGCA 540 

AGAATCAAGA GGATGGTTTT CAAAGGAGGG CGCACGGAGA CGGAACTTGC TCTGAAATAC 600 

CTTCTGCACA GAGGGTTGCC TGGAGGCAGA AATGCTTCTG TGCCCCAGAT CCTCATCATC 660 

GTCACTGATG GGAAGTCCCA GGGGGATGTG GCACTGCCAT CCAAGCAGCT GAAGGAAAGG 720 

GGTGTCACTG TGTTTGCTGT GGGGGTCAGG TTTCCCAGGT GGGAGGAGCT GCATGCACTG 780 

GCCAGCGAGC CTAGAGGGCA GCACGTGCTG TTGGCTGAGC AGGTGGAGGA TGCCACCAAC 840 

GGCCTCTTCA GCACCCTCAG CAGCTCGGCC ATCTGCTCCA GCGCCACGCC AGCTGGGAGC 900 

CCCGAGCTTG TCTTCATGGA GCGGTTAATG GGCATCTCTC TGATAGGCCC CTGTGACTCG 960 

CAGCCCTGCC AGAATGGAGG CACATGTGTT CCAGAAGGAC TGGACGGCTA CCAGTGCCTC 1020 

TGCCCGCTGG CCTTTGGAGG GGAGGCTAAC TGTGCCCTGA AGCTGAGCCT GGAATGCAGG 1080 

GTCGACCTCC TCTTCCTGCT GGACAGCTCT GCGGGCACCA CTCTGGACGG CTTCCTGCGG 1140 

GCCAAAGTCT TCGTGAAGCG GTTTGTGCGG GCCGTGCTGA GCGAGGACTC TCGGGCCCGA 1200 

GTGGGTGTGG CCACATACAG CAGGGAGCTG CTGGTGGCGG TGCCTGTGGG GGAGTACCAG 1260 

GATGTGCCTG ACCTGGTCTG GAGCCTCGAT GGCATTCCCT TCCGTGGTGG CCCCACCCTG 1320 

ACGGGCAGTG CCTTGCGGCA GGCGGCAGAG CGTGGCTTCG GGAGCGCCAC CAGGACAGGC 1380 

CAGGACCGGC CACGTAGAGT GGTGGTTTTG CTCACTGAGT CACACTCCGA GGATGAGGTT 1440 

GCGGGCCCAG CGCGTCACGC AAGGGCGCGA GAGCTGCTCC TGCTGGGTGT AGGCAGTGAG 1500 

GCCGTGCGGG CAGAGCTGGA GGAGATCACA GGCAGCCCAA AGCATGTGAT GGTCTACTCG 1560 

GATCCTCAGG ATCTGTTCAA CCAAATCCCT GAGCTGCAGG GGAAGCTGTG CAGCCGGCAG 1620 

CGGCCAGGGT GCCGGACACA AGCCCTGGAC CTCGTCTTCA TGTTGGACAC CTCTGCCTCA 1680 

GTAGGGCCCG AGAATTTTGC TCAGATGCAG AGCTTTGTGA GAAGCTGTGC CCTCCAGTTT 1740 

GAGGTGAACC CTGACGTGAC ACAGGTCGGC CTGGTGGTGT ATGGCAGCCA GGTGCAGACT 1800 

GCCTTCGGGC TGGACACCAA ACCCACCCGG GCTGCGATGC TGCGGGCCAT TAGCCAGGCC 1860 

CCCTACCTAG GTGGGGTGGG CTCAGCCGGC ACCGCCCTGC TGCACATCTA TGACAAAGTG 1920 

ATGACCGTCC AGAGGGGTGC CCGGCCTGGT GTCCCCAAAG CTGTGGTGGT GCTCACAGGC 1980 

GGGAGAGGCG CAGAGGATGC AGCCGTTCCT GCCCAGAAGC TGAGGAACAA TGGCATCTCT 2040 

GTCTTGGTCG TGGGCGTGGG GCCTGTCCTA AGTGAGGGTC TGCGGAGGCT TGCAGGTCCC 2100 

CGGGATTCCC TGATCCACGT GGCAGCTTAC GCCGACCTGC GGTACCACCA GGACGTGCTC 2160 

ATTGAGTGGC TGTGTGGAGA AGCCAAGCAG CCAGTCAACC TCTGCAAACC CAGCCCGTGC 2220 

ATGAATGAGG GCAGCTGCGT CCTGCAGAAT GGGAGCTACC GCTGCAAGTG TCGGGATGGC 2280 

TGGGAGGGCC CCCACTGCGA GAACCGTGAG TGGAGCTCTT GCTCTGTATG TGTGAGCCAG 2340 

GGATGGATTC TTGAGACGCC CCTGAGGCAC ATGGCTCCCG TGCAGGAGGG CAGCAGCCGT 2400 

ACCCCTCCCA GCAACTACAG AGAAGGCCTG GGCACTGAAA TGGTGCCTAC CTTCTGGAAT 2460 
GTCTGTGCCC CAGGTCCTTA G 

Seq ID NO: 441 Protein sequence 
Protein Accession #: XP_061091.1 

1 11 21 31 41 51 

| | | | | |

MPNTSGTTRI EIWLLQEPPG HRALVAALLP VSPSPELALA PGYPPVPAAD DRFTLPMIGG 60 

QMHGEKVDLW SLGVLCYEFL VGKPPFEANE VHVSKETIGK ISAASKMMWC SAAVDIMFLL 120

DGSNSVGKGS FERSKHFAIT VCDGLDISPE RVRVGAFQFS STPHLEFPLD SFSTQQEVKA 180 

RIKRMVFKGG RTETELALKY LLHRGLPGGR NASVPQILII VTDGKSQGDV ALPSKQLKER 240 

GVTVFAVGVR FPRWEELHAL ASEPRGQHVL LAEQVEDATN GLFSTLSSSA ICSSATPAGS 300 

PELVFMERLM GISLIGPCDS QPCQNGGTCV PEGLDGYQCL CPLAFGGEAN CALKLSLECR 360 

VDLLFLLDSS AGTTLDGFLR AKVFVKRFVR AVLSEDSRAR VGVATYSREL LVAVPVGEYQ 420 

DVPDLVWSLD GIPFRGGPTL TGSALRQAAE RGFGSATRTG QDRPRRVVVL LTESHSEDEV 480

AGPARHARAR ELLLLGVGSE AVRAELEEIT GSPKHVMVYS DPQDLFNQIP ELQGKLCSRQ 540 

RPGCRTQALD LVFMLDTSAS VGPENFAQMQ SFVRSCALQF EVNPDVTQVG LVVYGSQVQT 600

AFGLDTKPTR AAMLRAISQA PYLGGVGSAG TALLHIYDKV MTVQRGARPG VPKAVVVLTG 660

GRGAEDAAVP AQKLRNNGIS VLVVGVGPVL SEGLRRLAGP RDSLIHVAAY ADLRYHQDVL 720
IEWLCGEAKQ PVNLCKPSPC MNEGSCVLQN GSYRCKCRDG WEGPHCENRE WSSCSVCVSQ 780
GWILETPLRH MAPVQEGSSR TPPSNYREGL GTEMVPTFWN VCAPGP 

Seq ID NO: 442 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..2424 

1 11 21 31 41 51 

I I I I I I 

ATGCCCCCTT TCCTGTTGCT GGAGGCCGTC TGTGTTTTCC TGTTTTCCAG AGTGCCCCCA 60 

TCTCTCCCTC TCCAGGAAGT CCATGTAAGC AAAGAAACCA TCGGGAAGAT TTCAGCTGCC 120 

AGCAAAATGA TGTGGTGCTC GGCTGCAGTG GACATCATGT TTCTGTTAGA TGGGTCTAAC 180 

AGCGTCGGGA AAGGGAGCTT TGAAAGGTCC AAGCACTTTG CCATCACAGT CTGTGACGGT 240 

CTGGACATCA GCCCCGAGAG GGTCAGAGTG GGAGCATTCC AGTTCAGTTC CACTCCTCAT 300 

CTGGAATTCC CCTTGGATTC ATTTTCAACC CAACAGGAAG TGAAGGCAAG AATCAAGAGG 360 

ATGGTTTTCA AAGGAGGGCG CACGGAGACG GAACTTGCTC TGAAATACCT TCTGCACAGA 420 

GGGTTGCCTG GAGGCAGAAA TGCTTCTGTG CCCCAGATCC TCATCATCGT CACTGATGGG 480 

AAGTCCCAGG GGGATGTGGC ACTGCCATCC AAGCAGCTGA AGGAAAGGGG TGTCACTGTG 540 

TTTGCTGTGG GGGTCAGGTT TCCCAGGTGG GAGGAGCTGC ATGCACTGGC CAGCGAGCCT 600 

AGAGGGCAGC ACGTGCTGTT GGCTGAGCAG GTGGAGGATG CCACCAACGG CCTCTTCAGC 660 

ACCCTCAGCA GCTCGGCCAT CTGCTCCAGC GCCACGCCAG ACTGCAGGGT CGAGGCTCAC 720 

CCCTGTGAGC ACAGGACGCT GGAGATGGTC CGGGAGTTCG CTGGCAATGC CCCATGCTGG 780 

AGAGGATCGC GGCGGACCCT TGCGGTGCTG GCTGCACACT GTCCCTTCTA CAGCTGGAAG 840 

AGAGTGTTCC TAACCCACCC TGCCACCTGC TACAGGACCA CCTGCCCAGG CCCCTGTGAC 900 

TCGCAGCCCT GCCAGAATGG AGGCACATGT GTTCCAGAAG GACTGGACGG CTACCAGTGC 960 

CTCTGCCCGC TGGCCTTTGG AGGGGAGGCT AACTGTGCCC TGAAGCTGAG CCTGGAATGC 1020 

AGGGTCGACC TCCTCTTCCT GCTGGACAGC TCTGCGGGCA CCACTCTGGA CGGCTTCCTG 1080 

CGGGCCAAAG TCTTCGTGAA GCGGTTTGTG CGGGCCGTGC TGAGCGAGGA CTCTCGGGCC 1140

CGAGTGGGTG TGGCCACATA CAGCAGGGAG CTGCTGGTGG CGGTGCCTGT GGGGGAGTAC 1200 

CAGGATGTGC CTGACCTGGT CTGGAGCCTC GATGGCATTC CCTTCCGTGG TGGCCCCACC 1260 

CTGACGGGCA GTGCCTTGCG GCAGGCGGCA GAGCGTGGCT TCGGGAGCGC CACCAGGACA 1320

GGCCAGGACC GGCCACGTAG AGTGGTGGTT TTGCTCACTG AGTCACACTC CGAGGATGAG 1380

GTTGCGGGCC CAGCGCGTCA CGCAAGGGCG CGAGAGCTGC TCCTGCTGGG TGTAGGCAGT 1440 

GAGGCCGTGC GGGCAGAGCT GGAGGAGATC ACAGGCAGCC CAAAGCATGT GATGGTCTAC 1500 

TCGGATCCTC AGGATCTGTT CAACCAAATC CCTGAGCTGC AGGGGAAGCT GTGCAGCCGG 1560 

CAGCGGCCAG GGTGCCGGAC ACAAGCCCTG GACCTCGTCT TCATGTTGGA CACCTCTGCC 1620

TCAGTAGGGC CCGAGAATTT TGCTCAGATG CAGAGCTTTG TGAGAAGCTG TGCCCTCCAG 1680 

TTTGAGGTGA ACCCTGACGT GACACAGGTC GGCCTGGTGG TGTATGGCAG CCAGGTGCAG 1740 

ACTGCCTTCG GGCTGGACAC CAAACCCACC CGGGCTGCGA TGCTGCGGGC CATTAGCCAG 1800 

GCCCCCTACC TAGGTGGGGT GGGCTCAGCC GGCACCGCCC TGCTGCACAT CTATGACAAA 1860 

GTGATGACCG TCCAGAGGGG TGCCCGGCCT GGTGTCCCCA AAGCTGTGGT GGTGCTCACA 1920 

GGCGGGAGAG GCGCAGAGGA TGCAGCCGTT CCTGCCCAGA AGCTGAGGAA CAATGGCATC 1980 

TCTGTCTTGG TCGTGGGCGT GGGGCCTGTC CTAAGTGAGG GTCTGCGGAG GCTTGCAGGT 2040 

CCCCGGGATT CCCTGATCCA CGTGGCAGCT TACGCCGACC TGCGGTACCA CCAGGACGTG 2100 

CTCATTGAGT GGCTGTGTGG AGAAGCCAAG CAGCCAGTCA ACCTCTGCAA ACCCAGCCCG 2160 

TGCATGAATG AGGGCAGCTG CGTCCTGCAG AATGGGAGCT ACCGCTGCAA GTGTCGGGAT 2220 

GGCTGGGAGG GCCCCCACTG CGAGAACCGT GAGTGGAGCT CTTGCTCTGT ATGTGTGAGC 2280 

CAGGGATGGA TTCTTGAGAC GCCCCTGAGG CACATGGCTC CCGTGCAGGA GGGCAGCAGC 2340 

CGTACCCCTC CCAGCAACTA CAGAGAAGGC CTGGGCACTG AAATGGTGCC TACCTTCTGG 2400 
AATGTCTGTG CCCCAGGTCC TTAG 

Seq ID NO: 443 Protein sequence 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

| | | | | |

MPPFLLLEAV CVFLFSRVPP SLPLQEVHVS KETIGKISAA SKMMWCSAAV DIMFLLDGSN 60

SVGKGSFERS KHFAITVCDG LDISPERVRV GAFQFSSTPH LEFPLDSFST QQEVKARIKR 120 

MVFKGGRTET ELALKYLLHR GLPGGRNASV PQILIIVTDG KSQGDVALPS KQLKERGVTV 180 

FAVGVRFPRW EELHALASEP RGQHVLLAEQ VEDATNGLFS TLSSSAICSS ATPDCRVEAH 240 

PCEHRTLEMV REFAGNAPCW RGSRRTLAVL AAHCPFYSWK RVFLTHPATC YRTTCPGPCD 300 

SQPCQNGGTC VPEGLDGYQC LCPLAFGGEA NCALKLSLEC RVDLLFLLDS SAGTTLDGFL 360

RAKVFVKRFV RAVLSEDSRA RVGVATYSRE LLVAVPVGEY QDVPDLVWSL DGIPFRGGPT 420 

LTGSALRQAA ERGFGSATRT GQDRPRRVVV LLTESHSEDE VAGPARHARA RELLLLGVGS 480

EAVRAELEEI TGSPKHVMVY SDPQDLFNQI PELQGKLCSR QRPGCRTQAL DLVFMLDTSA 540 

SVGPENFAQM QSFVRSCALQ FEVNPDVTQV GLVVYGSQVQ TAFGLDTKPT RAAMLRAISQ 600

APYLGGVGSA GTALLHIYDK VMTVQRGARP GVPKAVVVLT GGRGAEDAAV PAQKLRNNGI 660

SVLVVGVGPV LSEGLRRLAG PRDSLIHVAA YADLRYHQDV LIEWLCGEAK QPVNLCKPSP 720

CMNEGSCVLQ NGSYRCKCRD GWEGPHCENR EWSSCSVCVS QGWILETPLR HMAPVQEGSS 780 
RTPPSNYREG LGTEMVPTFW NVCAPGP 

Seq ID NO: 444 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 89..2356

1 11 21 31 41 51

| | | | | |

GCCCCCTGGC CCGAGCCGCG CCCGGGTCTG TGAGTAGAGC CGCCCGGGCA CCGAGCGCTG 60 

GTCGCCGCTC TCCTTCCGTT ATATCAACAT GCCCCCTTTC CTGTTGCTGG AAGCCGTCTG 120 

TGTTTTCCTG TTTTCCAGAG TGCCCCCATC TCTCCCTCTC CAGGAAGTCC ATGTAAGCAA 180 

AGAAACCATC GGGAAGATTT CAGCTGCCAG CAAAATGATG TGGTGCTCGG CTGCAGTGGA 240 

CATCATGTTT CTGTTAGATG GGTCTAACAG CGTCGGGAAA GGGAGCTTTG AAAGGTCCAA 300 

GCACTTTGCC ATCACAGTCT GTGACGGTCT GGACATCAGC CCCGAGAGGG TCAGAGTGGG 360 

AGCATTCCAG TTCAGTTCCA CTCCTCATCT GGAATTCCCC TTGGATTCAT TTTCAACCCA 420 

ACAGGAAGTG AAGGCAAGAA TCAAGAGGAT GGTTTTCAAA GGAGGGCGCA CGGAGACGGA 480 

ACTTGCTCTG AAATACCTTC TGCACAGAGG GTTGCCTGGA GGCAGAAATG CTTCTGTGCC 540 

CCAGATCCTC ATCATCGTCA CTGATGGGAA GTCCCAGGGG GATGTGGCAC TGCCATCCAA 600 
GCAGCTGAAG GAAAGGGGTG TCACTGTGTT TGCTGTGGGG GTCAGGTTTC CCAGGTGGGA 660

GGAGCTGCAT GCACTGGCCA GCGAGCCTAG AGGGCAGCAC GTGCTGTTGG CTGAGCAGGT 720

GGAGGATGCC ACCAACGGCC TCTTCAGCAC CCTCAGCAGC TCGGCCATCT GCTCCAGCGC 780 

CACGCCAGAC TGCAGGGTCG AGGCTCACCC CTGTGAGCAC AGGACGCTGG AGATGGTCCG 840

GGAGTTCGCT GGCAATGCCC CATGCTGGAG AGGATCGCGG CGGACCCTTG CGGTGCTGGC 900 

TGCACACTGT CCCTTCTACA GCTGGAAGAG AGTGTTCCTA ACCCACCCTG CCACCTGCTA 960 

CAGGACCACC TGCCCAGGCC CCTGTGACTC GCAGCCCTGC CAGAATGGAG GCACATGTGT 1020 

TCCAGAAGGA CTGGACGGCT ACCAGTGCCT CTGCCCGCTG GCCTTTGGAG GGGAGGCTAA 1080 

CTGTGCCCTG AAGCTGAGCC TGGAATGCAG GGTCGACCTC CTCTTCCTGC TGGACAGCTC 1140 

TGCGGGCACC ACTCTGGACG GCTTCCTGCG GGCCAAAGTC TTCGTGAAGC GGTTTGTGCG 1200 

GGCCGTGCTG AGCGAGGACT CTCGGGCCCG AGTGGGTGTG GCCACATACA GCAGGGAGCT 1260 

GCTGGTGGCG GTGCCTGTGG GGGAGTACCA GGATGTGCCT GACCTGGTCT GGAGCCTCGA 1320 

TGGCATTCCC TTCCGTGGTG GCCCCACCCT GACGGGCAGT GCCTTGCGGC AGGCGGCAGA 1380 

GCGTGGCTTC GGGAGCGCCA CCAGGACAGG CCAGGACCGG CCACGTAGAG TGGTGGTTTT 1440 

GCTCACTGAG TCACACTCCG AGGATGAGGT TGCGGGCCCA GCGCGTCACG CAAGGGCGCG 1500

AGAGCTGCTC CTGCTGGGTG TAGGCAGTGA GGCCGTGCGG GCAGAGCTGG AGGAGATCAC 1560 

AGGCAGCCCA AAGCATGTGA TGGTCTACTC GGATCCTCAG GATCTGTTCA ACCAAATCCC 1620 

TGAGCTGCAG GGGAAGCTGT GCAGCCGGCA GCGGCCAGGG TGCCGGACAC AAGCCCTGGA 1680 

CCTCGTCTTC ATGTTGGACA CCTCTGCCTC AGTAGGGCCC GAGAATTTTG CTCAGATGCA 1740 

GAGCTTTGTG AGAAGCTGTG CCCTCCAGTT TGAGGTGAAC CCTGACGTGA CACAGGTCGG 1800 

CCTGGTGGTG TATGGCAGCC AGGTGCAGAC TGCCTTCGGG CTGGACACCA AACCCACCCG 1860 

GGCTGCGATG CTGCGGGCCA TTAGCCAGGC CCCCTACCTA GGTGGGGTGG GCTCAGCCGG 1920 

CACCGCCCTG CTGCACATCT ATGACAAAGT GATGACCGTC CAGAGGGGTG CCCGGCCTGG 1980 

TGTCCCCAAA GCTGTGGTGG TGCTCACAGG CGGGAGAGGC GCAGAGGATG CAGCCGTTCC 2040 

TGCCCAGAAG CTGAGGAACA ATGGCATCTC TGTCTTGGTC GTGGGCGTGG GGCCTGTCCT 2100 

AAGTGAGGGT CTGCGGAGGC TTGCAGGTCC CCGGGATTCC CTGATCCACG TGGCAGCTTA 2160 

CGCCGACCTG CGGTACCACC AGGACGTGCT CATTGAGTGG CTGTGTGGAG AAGCCAAGCA 2220 

GCCAGTCAAC CTCTGCAAAC CCAGCCCGTG CATGAATGAG GGCAGCTGCG TCCTGCAGAA 2280

TGGGAGCTAC CGCTGCAAGT GTCGGGATGG CTGGGAGGGC CCCCACTGCG AGAACCGATT 2340 

CTTGAGACGC CCCTGAGGCA CATGGCTCCC GTGCAGGAGG GCAGCAGCCG TACCCCTCCC 2400 

AGCAACTACA GAGAAGGCCT GGGCACTGAA ATGGTGCCTA CCTTCTGGAA TGTCTGTGCC 2460 

CCAGGTCCTT AGAATGTCTG CTTCCCGCCG TGGCCAGGAC CACTATTCTC ACTGAGGGAG 2520 

GAGGATGTCC CAACTGCAGC CATGCTGCTT AGAGACAAGA AAGCAGCTGA TGTCACCCAC 2580 

AAACGATGTT GTTGAAAAGT TTTGATGTGT AAGTAAATAC CCACTTTCTG TACCTGCTGT 2640 

GCCTTGTTGA GGCTATGTCA TCTGCCACCT TTCCCTTGAG GATAAACAAG GGGTCCTGAA 2700 

GACTTAAATT TAGCGGCCTG ACGTTCCTTT GCACACAATC AATGCTCGCC AGAATGTTGT 2760 
TGACACAGTA ATGCCCAGCA GAGGCCTTTA CTAGAGCATC CTTTGGACGG 

Seq ID NO: 445 Protein sequence 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

I I I I I I 

MPPPLLLEAV CVPLPSRVPP SLPLQEVHVS KETIGKISAA SKMMWCSAAV DIMFLLDGSN 60 

SVGKGSFERS KHFAITVCDG LDISPERVRV GAFQFSSTPH LEFPLDSFST QQEVKARIKR 120 

MVFKGGRTET ELALKYLLHR GLPGGRNASV PQILIIVTDG KSQGDVALPS KQLKERGVTV 180 

FAVGVRFPRW EELHALASEP RGQHVLLAEQ VEDATNGLFS TLSSSAICSS ATPDCRVEAH 240 

PCEHRTLEMV REFAGNAPCW RGSRRTLAVL AAHCPFYSWK RVFLTHPATC YRTTCPGPCD 300 

SQPCQNGGTC VPEGLDGYQC LCPLAFGGEA NCALKLSLEC RVDLLFLLDS SAGTTLDGFL 360 

RAKVFVKRFV RAVLSEDSRA RVGVATYSRE LLVAVPVGEY QDVPDLVWSL DGIPFRGGPT 420 

LTGSALRQAA ERGFGSATRT GQDRPRRVVV LLTESHSEDE VAGPARHARA RELLLLGVGS 480

EAVRAELEEI TGSPKHVMVY SDPQDLFNQI PELQGKLCSR QRPGCRTQAL DLVFMLDTSA 540

SVGPENFAQM QSFVRSCALQ FEVNPDVTQV GLVVYGSQVQ TAFGLDTKPT RAAMLRAISQ 600

APYLGGVGSA GTALLHIYDK VMTVQRGARP GVPKAVVVLT GGRGAEDAAV PAQKLRNNGI 660

SVLVVGVGPV LSEGLRRLAG PRDSLIHVAA YADLRYHQDV LIEWLCGEAK QPVNLCKPSP 720
CMNEGSCVLQ NGSYRCKCRD GWEGPHCENR FLRRP 

Seq ID NO: 446 DNA sequence

Nucleic Acid Accession #: NM_031942.1

Coding sequence: 145..1260

1 11 21 31 41 51 

I I I I I I 

CCCGAGCCCC GCCCCTCCGG GCCCGGGTCG GCGCGCCCAG CCTGCCAGCC GCGCTGCTGC 60 

TGCTCCTCCT GCTGTGGGAC CGCTGACCGC GCGGCTGCTC CGCTCTCCCC GCTCCAAGCG 120 

CCGATCTGGG CACCCGCCAC CAGCATGGAC GCTCGCCGCG TGCCGCAGAA AGATCTCAGA 180 

GTAAAGAAGA ACTTAAAGAA ATTCAGATAT GTGAAGTTGA TTTCCATGGA AACCTCGTCA 240 

TCCTCTGATG ACAGTTGTGA CAGCTTTGCT TCTGATAATT TTGCAAACAC GAGGCTGCAG 300 

TCAGTTCGGG AAGGCTGTAG GACCCGCAGC CAGTGCAGGC ACTCTGGACC TCTCAGGGTG 360 

GCGATGAAGT TTCCAGCGCG GAGTACCAGG GGAGCAACCA ACAAAAAAGC AGAGTCCCGC 420

CAGCCCTCAG AGAATTCTGT GACTGATTCC AACTCCGATT CAGAAGATGA AAGTGGAATG 480 

AATTTTTTGG AGAAAAGGGC TTTAAATATA AAGCAAAACA AAGCAATGCT TGCAAAACTC 540 

ATGTCTGAAT TAGAAAGCTT CCCTGGCTCG TTCCGTGGAA GACATCCCCT CCCAGGCTCC 600 

GACTCACAAT CAAGGAGACC GCGAAGGCGT ACATTCCCGG GTGTTGCTTC CAGGAGAAAC 660 

CCTGAACGGA GAGCTCGTCC TCTTACCAGG TCAAGGTCCC GGATCCTCGG GTCCCTTGAC 720 

GCTCTACCCA TGGAGGAGGA GGAGGAAGAG GATAAGTACA TGTTGGTGAG AAAGAGGAAG 780 

ACCGTGGATG GCTACATGAA TGAAGATGAC CTGCCCAGAA GCCGTCGCTC CAGATCATCC 840 

GTGACCCTTC CGCATATAAT TCGCCCAGTG GAAGAAATTA CAGAGGAGGA GTTGGAGAAC 900 

GTCTGCAGCA ATTCTCGAGA GAAGATATAT AACCGTTCAC TGGGCTCTAC TTGTCATCAA 960 

TGCCGTCAGA AGACTATTGA TACCAAAACA AACTGCAGAA ACCCAGACTG CTGGGGCGTT 1020 

CGAGGCCAGT TCTGTGGCCC CTGCCTTCGA AACCGTTATG GTGAAGAGGT CAGGGATGCT 1080 

CTGCTGGATC CGAACTGGCA TTGCCCGCCT TGTCGAGGAA TCTGCAACTG CAGTTTCTGC 1140 

CGGCAGCGAG ATGGACGGTG TGCGACTGGG GTCCTTGTGT ATTTAGCCAA ATATCATGGC 1200 

TTTGGGAATG TGCATGCCTA CTTGAAAAGC CTGAAACAGG AATTTGAAAT GCAAGCATAA 1260 

TATCTGGAAA ATTTGCTGCC TGCCTTCTAC TTCTCAAATC TTTCTTGTAA AAGTTTCCAA 1320 

TTTTTTCACT GAAACCTGAG TTAAAAATCT TGATGATCAG CCTGTTTCAT AAGAAACTCC 1380 

AATCAAGTTA ATCTTAGCAG ACATGTGTTT CTGGAGCATC ACAGAAGGTA TATTGCTAGT 1440 
TACACTTTGC CCTCCTGCAG TTTCTTCTCT GCTCCCAACC CCCATCTCAT AGCATCCCCC 1500

TCTATTTCCA ATGCTCCTCT CCAACCGCTT AGTTTCTGAA TTTCTTTTAA ATTACAGTTT 1560

TATGAAAGCA TATTTTATTT ACTTGGTGTT GAAATAGCCC TCATAAAACC TAAGCACTTG 1620

GAAACACAAT AATAGTATTA ACTAACTAGA TCTATTGAAT TTCAGAGAAG AGCCTTCTAA 1680

CTTGTTTACA CAAAAACGAG TATGATTTAG CACTCATACT AGTTGAAATT TTTAATAGAA 1740

TCAAGGCACA AAAGTCTTAA AACCATGTGG AAAAATTAGG TAATTATTGC AGATTGATGT 1800

CTCTCAATCC CATGTATTGC GCTTATGTTA CAAGTTGTTG TCACAGTTGA GACTTAATTT 1860

CTCCTAATTT CTTCTGCCCG AAGGGTAAGT GGTGCGTCCA GCTTACACGA TCATAATTCA 1920

AAGGTTGGTG GGCAATGTAA TACTTAATTA AAATAATGAT GGAAGAGCTA TCTGGAGATT 1980

ATGAGTAAGC TGATTTGAAT TTTCAGTATA AAACTTTAGT ATAATTGTAG TTTGCAAAGT 2040

TTATTTCAGT TCACATGTAA GGTATTGCAA ATAAATTCTT GGACAATTTT GTATGGAAAC 2100

TTGATATTAA AAACTAGTCT GTGGTTCTTT GCAGTTTCTT GTAAATTTAT AAACCAGGCA 2160

CAAGGTTCAA GTTTAGATTT TAAGCACTTT TATAACAATG ATAAGTGCCT TTTTGGAGAT 2220

GTAACTTTTA GCAGTTTGTT AACCTGACAT CTCTGCCAGT CTAGTTTCTG GGCAGGTTTC 2280

CTGTGTCAGT ATTCCCCCTC CTCTTTGCAT TAATCAAGGT ATTTGGTAGA GGTGGAATCT 2340

AAGTGTTTGT ATGTCCAATT TACTTGCATA TGTAAACCAT TGCTGTGCCA TTCAATGTTT 2400

GATGCATAAT TGGACCTTGA ATCGATAAGT GTAAATACAG CTTTTGATCT GTAATGCTTT 2460
TATACAAAAG TTTATTTTAA TAATAAAATG TTTGTTCTAA AAAAAAAAAA

Seq ID NO: 447 Protein sequence
Protein Accession #: NP_114148.1

1 11 21 31 41 51

I I I I I I

MDARRVPQKD LRVKKNLKKF RYVKLISMET SSSSDDSCDS FASDNFANTR LQSVREGCRT 60

RSQCRHSGPL RVAMKFPARS TRGATNKKAE SRQPSENSVT DSNSDSEDES GMNFLEKRAL 120

NIKQNKAMLA KLMSELESFP GSFRGRHPLP GSDSQSRRPR RRTFPGVASR RNPERRARPL 180

TRSRSRILGS LDALPMEEEE EEDKYMLVRK RKTVDGYMNE DDLPRSRRSR SSVTLPHIIR 240

PVEEITEEEL ENVCSNSREK IYNRSLGSTC HQCRQKTIDT KTNCRNPDCW GVRGQFCGPC 300

LRNRYGEEVR DALLDPNWHC PPCRGICNCS FCRQRDGRCA TGVLVYLAKY HGFGNVHAYL 360

KSLKQEFEMQ A

Seq ID NO: 448 DNA sequence
Nucleic Acid Accession #: NM_019894
Coding sequence: 1..1314

1 11 21 31 41 51

I I I I I I

ATGTTACAGG ATCCTGACAG TGATCAACCT CTGAACAGCC TCGATGTCAA ACCCCTGCGC 60
AAACCCCGTA TCCCCATGGA GACCTTCAGA AAGGTGGGGA TCCCCATCAT CATAGCACTA 120
CTGAGCCTGG CGAGTATCAT CATTGTGGTT GTCCTCATCA AGGTGATTCT GGATAAATAC 180
TACTTCCTCT GCGGGCAGCC TCTCCACTTC ATCCCGAGGA AGCAGCTGTG TGACGGAGAG 240
CTGGACTGTC CCTTGGGGGA GGACGAGGAG CACTGTGTCA AGAGCTTCCC CGAAGGGCCT 300
GCAGTGGCAG TCCGCCTCTC CAAGGACCGA TCCACACTGC AGGTGCTGGA CTCGGCCACA 360
GGGAACTGGT TCTCTGCCTG TTTCGACAAC TTCACAGAAG CTCTCGCTGA GACAGCCTGT 420
AGGCAGATGG GCTACAGCAG CAAACCCACT TTCAGAGCTG TGGAGATTGG CCCAGACCAG 480
GATCTGGATG TTGTTGAAAT CACAGAAAAC AGCCAGGAGC TTCGCATGCG GAACTCAAGT 540
GGGCCCTGTC TCTCAGGCTC CCTGGTCTCC CTGCACTGTC TTGCCTGTGG GAAGAGCCTG 600
AAGACCCCCC GTGTGGTGGG TGGGGAGGAG GCCTCTGTGG ATTCTTGGCC TTGGCAGGTC 660
AGCATCCAGT ACGACAAACA GCACGTCTGT GGAGGGAGCA TCCTGGACCC CCACTGGGTC 720
CTCACGGCAG CCCACTGCTT CAGGAAACAT ACCGATGTGT TCAACTGGAA GGTGCGGGCA 780
GGCTCAGACA AACTGGGCAG CTTCCCATCC CTGGCTGTGG CCAAGATCAT CATCATTGAA 840
TTCAACCCCA TGTACCCCAA AGACAATGAC ATCGCCCTCA TGAAGCTGCA GTTCCCACTC 900
ACTTTCTCAG GCACAGTCAG GCCCATCTGT CTGCCCTTCT TTGATGAGGA GCTCACTCCA 960
GCCACCCCAC TCTGGATCAT TGGATGGGGC TTTACGAAGC AGAATGGAGG GAAGATGTCT 1020
GACATACTGC TGCAGGCGTC AGTCCAGGTC ATTGACAGCA CACGGTGCAA TGCAGACGAT 1080
GCGTACCAGG GGGAAGTCAC CGAGAAGATG ATGTGTGCAG GCATCCCGGA AGGGGGTGTG 1140
GACACCTGCC AGGGTGACAG TGGTGGGCCC CTGATGTACC AATCTGACCA GTGGCATGTG 1200
GTGGGCATCG TTAGCTGGGG CTATGGCTGC GGGGGCCCGA GCACCCCAGG AGTATACACC 1260
AAGGTCTCAG CCTATCTCAA CTGGATCTAC AATGTCTGGA AGGCTGAGCT GTAA

Seq ID NO: 449 Protein sequence
Protein Accession #: NP_063947.1

1 11 21 31 41 51

I I I I I I

MLQDPDSDQP LNSLDVKPLR KPRIPMETFR KVGIPIIIAL LSLASIIIVV VLIKVILDKY 60
YFLCGQPLHF IPRKQLCDGE LDCPLGEDEE HCVKSFPEGP AVAVRLSKDR STLQVLDSAT 120
GNWFSACFDN FTEALAETAC RQMGYSSKPT FRAVEIGPDQ DLDVVEITEN SQELRMRNSS 180
GPCLSGSLVS LHCLACGKSL KTPRVVGGEE ASVDSWPWQV SIQYDKQHVC GGSILDPHWV 240
LTAAHCFRKH TDVFNWKVRA GSDKLGSFPS LAVAKIIIIE FNPMYPKDND IALMKLQFPL 300
TFSGTVRPIC LPFFDEELTP ATPLWIIGWG FTKQNGGKMS DILLQASVQV IDSTRCNADD 360
AYQGEVTEKM MCAGIPEGGV DTCQGDSGGP LMYQSDQWHV VGIVSWGYGC GGPSTPGVYT 420
KVSAYLNWIY NVWKAEL

Seq ID NO: 450 DNA sequence
Nucleic Acid Accession #: XM_051860.2
Coding sequence: 52..3042

1 11 21 31 41 51

I I I I I I

GCTCACCCAG GAAAAATATG CAATCGTCCC ATTGATATAC AGGCCACTAC AATGGATGGA 60
GTTAACCTCA GCACCGAGGT TGTCTACAAA AAAGGCCAGG ATTATAGGTT TGCTTGCTAC 120
GACCGGGGCA GAGCCTGCCG GAGCTACCGT GTACGGTTCC TCTGTGGGAA GCCTGTGAGG 180
CCCAAACTCA CAGTCACCAT TGACACCAAT GTGAACAGCA CCATTCTGAA CTTGGAGGAT 240
AATGTACAGT CATGGAAACC TGGAGATACC CTGGTCATTG CCAGTACTGA TTACTCCATG 300
TACCAGGCAG AAGAGTTCCA GGTGCTTCCC TGCAGATCCT GCGCCCCCAA CCAGGTCAAA 360

GTGGCAGGGA AACCAATGTA CCTGCACATC GGGGAGGAGA TAGACGGCGT GGACATGCGG 420

GCGGAGGTTG GGCTTCTGAG CCGGAACATC ATAGTGATGG GGGAGATGGA GGACAAATGC 480 

TACCCCTACA GAAACCACAT CTGCAATTTC TTTGACTTCG ATACCTTTGG GGGCCACATC 540

AAGTTTGCTC TGGGATTTAA GGCAGCACAC TTGGAGGGCA CGGAGCTGAA GCATATGGGA 600 

CAGCAGCTGG TGGGTCAGTA CCCGATTCAC TTCCACCTGG CCGGTGATGT AGACGAAAGG 660 

GGAGGTTATG ACCCACCCAC ATACATCAGG GACCTCTCCA TCCATCATAC ATTCTCTCGC 720 

TGCGTCACAG TCCATGGCTC CAATGGCTTG TTGATCAAGG ACGTTGTGGG CTATAACTCT 780 

TTGGGCCACT GCTTCTTCAC GGAAGATGGG CCGGAGGAAC GCAACACTTT TGACCACTGT 840 

CTTGGCCTCC TTGTCAAGTC TGGAACCCTC CTCCCCTCGG ACCGTGACAG CAAGATGTGC 900 

AAGATGATCA CAGGAGACTC CTACCCAGGG TACATCCCCA AGCCCAGGCA AGACTGCAAT 960 

GCTGTGTCCA CCTTCTGGAT GGCCAATCCC AACAACAACC TCATCAACTG TGCCGCTGCA 1020 

GGATCTGAGG AAACTGGATT TTGGTTTATT TTTCACCACG TACCAACGGG CCCCTCCGTG 1080 

GGAATGTACT CCCCAGGTTA TTCAGAGCAC ATTCCACTGG GAAAATTCTA TAACAACCGA 1140 

GCACATTCCA ACTACCGGGC TGGCATGATC ATAGACAACG GAGTCAAAAC CACCGAGGCC 1200 

TCTGCCAAGG ACAAGCGGCC GTTCCTCTCA ATCATCTCTG CCAGATACAG CCCTCACCAG 1260 

GACGCCGACC CGCTGAAGCC CCGGGAGCCG GCCATCATCA GACACTTCAT TGCCTACAAG 1320 

AACCAGGACC ACGGGGCCTG GCTGCGCGGC GGGGATGTGT GGCTGGACAG CTGCCGGTTT 1380 

GCTGACAATG GCATTGGCCT GACCCTGGCC AGTGGTGGAA CCTTCCCGTA TGACGACGGC 1440 

TCCAAGCAAG AGATAAAGAA CAGCTTGTTT GTTGGCGAGA GTGGCAACGT GGGGACGGAA 1500 

ATGATGGACA ATAGGATCTG GGGCCCTGGC GGCTTGGACC ATAGCGGAAG GACCCTCCCT 1560

ATAGGCCAGA ATTTTCCAAT TAGAGGAATT CAGTTATATG ATGGCCCCAT CAACATCCAA 1620 

AACTGCACTT TCCGAAAGTT TGTGGCCCTG GAGGGCCGGC ACACCAGCGC CCTGGCCTTC 1680 

CGCCTGAATA ATGCCTGGCA GAGCTGCCCC CATAACAACG TGACCGGCAT TGCCTTTGAG 1740 

GACGTTCCGA TTACTTCCAG AGTGTTCTTC GGAGAGCCTG GGCCCTGGTT CAACCAGCTG 1800 

GACATGGATG GGGATAAGAC ATCTGTGTTC CATGACGTCG ACGGCTCCGT GTCCGAGTAC 1860 

CCTGGCTCCT ACCTCACGAA GAATGACAAC TGGCTGGTCC GGCACCCAGA CTGCATCAAT 1920 

GTTCCCGACT GGAGAGGGGC CATTTGCAGT GGGTGCTATG CACAGATGTA CATTCAAGCC 1980 

TACAAGACCA GTAACCTGCG AATGAAGATC ATCAAGAATG ACTTCCCCAG CCACCCTCTT 2040

TACCTGGAGG GGGCGCTCAC CAGGAGCACC CATTACCAGC AATACCAACC GGTTGTCACC 2100 

CTGCAGAAGG GCTACACCAT CCACTGGGAC CAGACGGCCC CCGCCGAACT CGCCATCTGG 2160 

CTCATCAACT TCAACAAGGG CGACTGGATC CGAGTGGGGC TCTGCTACCC GCGAGGCACC 2220 

ACATTCTCCA TCCTCTCGGA TGTTCACAAT CGCCTGCTGA AGCAAACGTC CAAGAGGGGC 2280 

GTCTTCGTGA GGACCTTGCA GATGGACAAA GTGGAGCAGA GCTACCCTGG CAGGAGCCAC 2340 

TACTACTGGG ACGAGGACTC AGGGCTGTTG TTCCTGAAGC TGAAAGCTCA GAACGAGAGA 2400 

GAGAAGTTTG CTTTCTGCTC CATGAAAGGC TGTGAGAGGA TAAAGATTAA AGCTCTGATT 2460 

CCAAAGAACG CAGGCGTCAG TGACTGCACA GCCACAGCTT ACCCCAAGTT CACCGAGAGG 2520 

GCTGTCGTAG ACGTGCCGAT GCCCAAGAAG CTCTTTGGTT CTCAGCTGAA AACAAAGGAC 2580 

CATTTCTTGG AGGTGAAGAT GGAGAGTTCC AAGCAGCACT TCTTCCACCT CTGGAACGAC 2640 

TTCGCTTACA TTGAAGTGGA TGGGAAGAAG TACCCCAGTT CGGAGGATGG CATCCAGGTG 2700 

GTGGTGATTG ACGGGAACCA AGGGCGCGTG GTGAGCCACA CGAGCTTCAG GAACTCCATT 2760 

CTGCAAGGCA TACCATGGCA GCTTTTCAAC TATGTGGCGA CCATCCCTGA CAATTCCATA 2820 

GTGCTTATGG CATCAAAGGG AAGATACGTC TCCAGAGGCC CATGGACCAG AGTGCTGGAA 2880 

AAGCTTGGGG CAGACAGGGG TCTCAAGTTG AAAGAGCAAA TGGCATTCGT TGGCTTCAAA 2940 

GGCAGCTTCC GGCCCATCTG GGTGACACTG GACACTGAGG ATCACAAAGC CAAAATCTTC 3000 

CAAGTTGTGC CCATCCCTGT GGTGAAGAAG AAGAAGTTGT GAGGACAGCT GCCGCCCGGT 3060 

GCCACCTCGT GGTAGACTAT GACGGTGACT CTTGGCAGCA GACCAGTGGG GGATGGCTGG 3120 

GTCCCCCAGC CCCTGCCAGC AGCTGCCTGG GAAGGCCGTG TTTCAGCCCT GATGGGCCAA 3180 

GGGAAGGCTA TCAGAGACCC TGGTGCTGCC ACCTGCCCCT ACTCAAGTGT CTACCTGGAG 3240 

CCCCTGGGGC GGTGCTGGCC AATGCTGGAA ACATTCACTT TCCTGCAGCC TCTTGGGTGC 3300 

TTCTCTCCTA TCTGTGCCTC TTCAGTGGGG GTTTGGGGAC CATATCAGGA GACCTGGGTT 3360 

GTGCTGACAG CAAAGATCCA CTTTGGCAGG AGCCCTGACC CAGCTAGGAG GTAGTCTGGA 3420 

GGGCTGGTCA TTCACAGATC CCCATGGTCT TCAGCAGACA AGTGAGGGTG GTAAATGTAG 3480 

GAGAAAGAGC CTTGGCCTTA AGGAAATCTT TACTCCTGTA AGCAAGAGCC AACCTCACAG 3540 

GATTAGGAGC TGGGGTAGAA CTGGCTATCC TTGGGGAAGA GGCAAGCCCT GCCTCTGGCC 3600 

GTGTCCACCT TTCAGGAGAC TTTGAGTGGC AGGTTTGGAC TTGGACTAGA TGACTCTCAA 3660 

AGGCCCTTTT AGTTCTGAGA TTCCAGAAAT CTGCTGCATT TCACATGGTA CCTGGAACCC 3720 

AACAGTTCAT GGATATCCAC TGATATCCAT GATGCTGGGT GCCCCAGCGC ACACGGGATG 3780 

GAGAGGTGAG AACTAATGCC TAGCTTGAGG GGTCTGCAGT CCAGTAGGGC AGGCAGTCAG 3840 

GTCCATGTGC ACTGCAATGC CAGGTGGAGA AATCACAGAG AGGTAAAATG GAGGCCAGTG 3900 

CCATTTCAGA GGGGAGGCTC AGGAAGGCTT CTTGCTTACA GGAATGAAGG CTGGGGGCAT 3960 

TTTGCTGGGG GGAGATGAGG CAGCCTCTGG AATGGCTCAG GGATTCAGCC CTCCCTGCCG 4020 

CTGCCTGCTG AAGCTGGTGA CTACGGGGTC GCCCTTTGCT CACGTCTCTC TGGCCCACTC 4080

ATGATGGAGA AGTGTGGTCA GAGGGGAGCA ATGGGCTTTG CTGCTTATGA GCACAGAGGA 4140 

ATTCAGTCCC CAGGCAGCCC TGCCTCTGAC TCCAAGAGGG TGAAGTCCAC AGAAGTGAGC 4200 

TCCTGCCTTA GGGCCTCATT TGCTCTTCAT CCAGGGAACT GAGCACAGGG GGCCTCCAGG 4260 

AGACCCTAGA TGTGCTCGTA CTCCCTCGGC CTGGGATTTC AGAGCTGGAA ATATAGAAAA 4320 

TATCTAGCCC AAAGCCTTCA TTTTAACAGA TGGGGAAAGT GAGCCCCCAA GATGGGAAAG 4380 

AACCACACAG CTAAGGGAGG GCCTGGGGAG CCCCACCCTA GCCCTTGCTG CCACACCACA 4440 

TTGCCTCAAC AACCGGCCCC AGAGTGCCCA GGCACTCCTG AGGTAGCTTC TGGAAATGGG 4500 

GACAAGTCCC CTCGAAGGAA AGGAAATGAC TAGAGTAGAA TGACAGCTAG CAGATCTCTT 4560 

CCCTCCTGCT CCCAGCGCAC ACAAACCCGC CCTCCCCTTG GTGTTGGCGG TCCCTGTGGC 4620 

CTTCACTTTG TTCACTACCT GTCAGCCCAG CCTGGGTGCA CAGTAGCTGC AACTCCCCAT 4680 

TGGTGCTACC TGGCTCTCCT GTCTCTGCAG CTCTACAGGT GAGGCCCAGC AGAGGGAGTA 4740 

GGGCTCGCCA TGTTTCTGGT GAGCCAATTT GGCTGATCTT GGGTGTCTGA ACAGCTATTG 4800 

GGTCCACCCC AGTCCCTTTC AGCTGCTGCT TAATGCCCTG CTCTCTCCCT GGCCCACCTT 4860 

ATAGAGAGCC CAAAGAGCTC CTGTAAGAGG GAGAACTCTA TCTGTGGTTT ATAATCTTGC 4920 

ACGAGGCACC AGAGTCTCCC TGGGTCTTGT GATGAACTAC ATTTATCCCC TTTCCTGCCC 4980 

CAACCACAAA CTCTTTCCTT CAAAGAGGGC CTGCCTGGCT CCCTCCACCC AACTGCACCC 5040 

ATGAGACTCG GTCCAAGAGT CCATTCCCCA GGTGGGAGCC AACTGTCAGG GAGGTCTTTC 5100 

CCACCAAACA TCTTTCAGCT GCTGGGAGGT GACCATAGGG CTCTGCTTTT AAAGATATGG 5160

CTGCTTCAAA GGCCAGAGTC ACAGGAAGGA CTTCTTCCAG GGAGATTAGT GGTGATGGAG 5220 

AGGAGAGTTA AAATGACCTC ATGTCCTTCT TGTCCACGGT TTTGTTGAGT TTTCACTCTT 5280 

CTAATGCAAG GGTCTCACAC TGTGAACCAC TTAGGATGTG ATCACTTTCA GGTGGCCAGG 5340 

AATGTTGAAT GTCTTTGGCT CAGTTCATTT AAAAAAGATA TCTATTTGAA AGTTCTCAGA 5400

GTTGTACATA TGTTTCACAG TACAGGATCT GTACATAAAA GTTTCTTTCC TAAACCATTC 5460 

AGCAAGAGCC AATATCTAGG CATTTTCTTG GTAGCACAAA TTTTCTTATT GCTTAGAAAA 5520 

TTGTCCTCCT TGTTATTTCT GTTTGTAAGA CTTAAGTGAG TTAGGTCTTT AAGGAAAGCA 5580 




ACGCTCCTCT GAAATGCTTG TCTTTTTTCT GTTGCCGAAA TAGCTGGTCC TTTTTCGGGA 5640 

GTTAGATGTA TAGAGTGTTT GTATGTAAAC ATTTCTTGTA GGCATCACCA TGAACAAAGA 5700

TATATTTTCT ATTTATTTAT TATATGTGCA CTTCAAGAAG TCACTGTCAG AGAAATAAAG 5760 
AATTGTCTTA AATGTCAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAA 

Seq ID NO: 451 Protein sequence 
Protein Accession #: XP_051860.2

1 11 21 31 41 51 

I I I I I I 

MDGVNLSTEV VYKKGQDYRF ACYDRGRACR SYRVRFLCGK PVRPKLTVTI DTNVNSTILN 60 

LEDNVQSWKP GDTLVIASTD YSMYQAEEFQ VLPCRSCAPN QVKVAGKPMY LHIGEEIDGV 120

DMRAEVGLLS RNIIVMGEME DKCYPYRNHI CNFFDFDTFG GHIKFALGFK AAHLEGTELK 180

HMGQQLVGQY PIHFHLAGDV DERGGYDPPT YIRDLSIHHT FSRCVTVHGS NGLLIKDVVG 240

YNSLGHCFFT EDGPEERNTF DHCLGLLVKS GTLLPSDRDS KMCKMITGDS YPGYIPKPRQ 300

DCNAVSTFWM ANPNNNLINC AAAGSEETGF WFIFHHVPTG PSVGMYSPGY SEHIPLGKFY 360

NNRAHSNYRA GMIIDNGVKT TEASAKDKRP FLSIISARYS PHQDADPLKP REPAIIRHFI 420

AYKNQDHGAW LRGGDVWLDS CRFADNGIGL TLASGGTFPY DDGSKQEIKN SLFVGESGNV 480

GTEMMDNRIW GPGGLDHSGR TLPIGQNFPI RGIQLYDGPI NIQNCTFRKF VALEGRHTSA 540

LAFRLNNAWQ SCPHNNVTGI AFEDVPITSR VFFGEPGPWF NQLDMDGDKT SVFHDVDGSV 600

SEYPGSYLTK NDNWLVRHPD CINVPDWRGA ICSGCYAQMY IQAYKTSNLR MKIIKNDFPS 660

HPLYLEGALT RSTHYQQYQP VVTLQKGYTI HWDQTAPAEL AIWLINFNKG DWIRVGLCYP 720

RGTTFSILSD VHNRLLKQTS KTGVFVRTLQ MDKVEQSYPG RSHYYWDEDS GLLFLKLKAQ 780

NEREKFAFCS MKGCERIKIK ALIPKNAGVS DCTATAYPKF TERAVVDVPM PKKLFGSQLK 840

TKDHFLEVKM ESSKQHFFHL WNDFAYIEVD GKKYPSSEDG IQVVVIDGNQ GRVVSHTSFR 900

NSILQGIPWQ LFNYVATIPD NSIVLMASKG RYVSRGPWTR VLEKLGADRG LKLKEQMAFV 960

GFKGSFRPIW VTLDTEDHKA KIFQVVPIPV VKKKKL
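
A listed protein can be cross-checked against its nucleotide record by translating the stated coding range. The following Python sketch is illustrative only and is not part of the original listing. It assumes that "Coding sequence: M..N" gives 1-based, inclusive coordinates and that the standard genetic code applies. The names join_sequence_rows, translate_cds and EXAMPLE_ROWS are hypothetical. The example rows are the first two rows of Seq ID NO: 450, whose coding sequence is stated to begin at position 52.

```python
import re

# Standard genetic code, built in T, C, A, G order (an assumption of this sketch).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def join_sequence_rows(rows):
    """Concatenate the 10-base blocks of listing rows into one string.

    Position counters (e.g. "60") and ruler tokens ("1 11 21", "I I I")
    are dropped because they do not consist solely of A, C, G and T.
    """
    blocks = []
    for row in rows:
        for token in row.split():
            if re.fullmatch(r"[ACGT]+", token):
                blocks.append(token)
    return "".join(blocks)


def translate_cds(dna, start, end):
    """Translate dna[start..end] (1-based, inclusive), stopping at a stop codon."""
    cds = dna[start - 1:end]
    protein = []
    for i in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First two rows of Seq ID NO: 450, preceded by the ruler lines of the listing.
EXAMPLE_ROWS = [
    "1 11 21 31 41 51",
    "I I I I I I",
    "GCTCACCCAG GAAAAATATG CAATCGTCCC ATTGATATAC AGGCCACTAC AATGGATGGA 60",
    "GTTAACCTCA GCACCGAGGT TGTCTACAAA AAAGGCCAGG ATTATAGGTT TGCTTGCTAC 120",
]

if __name__ == "__main__":
    dna = join_sequence_rows(EXAMPLE_ROWS)
    print(translate_cds(dna, 52, 120))
```

Translating positions 52 to 120 in this way gives a string that can be compared with the opening residues of Seq ID NO: 451. The same comparison can help locate character-recognition errors in a scanned row, since a mismatch between a translated codon and the listed residue points to the affected block.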

Seq ID NO: 452 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 261..2861

1 11 21 31 41 51 

I I I I I I 

GAGCTAGCGC TCAAGCAGAG CCCAGCGCGG TGCTATCGGA CAGAGCCTGG CGAGCGCAAG 60 

CGGCGCGGGG AGCCAGCGGG GCTGAGCGCG GCCAGGGTCT GAACCCAGAT TTCCCAGACT 120 

AGCTACCACT CCGCTTGCCC ACGCCCCGGG AGCTCGCGGC GCCTGGCGGT CAGCGACCAG 180 

ACGTCCGGGG CCGCTGCGCT CCTGGCCCGC GAGGCGTGAC ACTGTCTCGG CTACAGACCC 240 

AGAGGGAGCA CACTGCCAGG ATGGGAGCTG CTGGGAGGCA GGACTTCCTC TTCAAGGCCA 300 

TGCTGACCAT CAGCTGGCTC ACTCTGACCT GCTTCCCTGG GGCCACATCC ACAGTGGCTG 360 

CTGGGTGCCC TGACCAGAGC CCTGAGTTGC AACCCTGGAA CCCTGGCCAT GACCAAGACC 420 

ACCATGTGCA TATCGGCCAG GGCAAGACAC TGCTGCTCAC CTCTTCTGCC ACGGTCTATT 480 

CCATCCACAT CTCAGAGGGA GGCAAGCTGG TCATTAAAGA CCACGACGAG CCGATTGTTT 540 

TGCGAACCCG GCACATCCTG ATTGACAACG GAGGAGAGCT GCATGCTGGG AGTGCCCTCT 600 

GCCCTTTCCA GGGCAATTTC ACCATCATTT TGTATGGAAG GGCTGATGAA GGTATTCAGC 660 

CGGATCCTTA CTATGGTCTG AAGTACATTG GGGTTGGTAA AGGAGGCGCT CTTGAGTTGC 720 

ATGGACAGAA AAAGCTCTCC TGGACATTTC TGAACAAGAC CCTTCACCCA GGTGGCATGG 780 

CAGAAGGAGG CTATTTTTTT GAAAGGAGCT GGGGCCACCG TGGAGTTATT GTTCATGTCA 840 

TCGACCCCAA ATCAGGCACA GTCATCCATT CTGACCGGTT TGACACCTAT AGATCCAAGA 900 

AAGAGAGTGA ACGTCTGGTC CAGTATTTGA ACGCGGTGCC CGATGGCAGG ATCCTTTCTG 960 

TTGCAGTGAA TGATGAAGGT TCTCGAAATC TGGATGACAT GGCCAGGAAG GCGATGACCA 1020 

AATTGGGAAG CAAACACTTC CTGCACCTTG GATTTAGACA CCCTTGGAGT TTTCTAACTG 1080

TGAAAGGAAA rCCATCATCT TCAGTGGAAG ACCATATTGA ATATCATGGA CATCGAGGCT 1140 

CTGCTGCTGC CCGGGTATTC AAATTGTTCC AGACAGAGCA TGGCGAATAT TTCAATGTTT 1200 

CTTTGTCCAG TGAGTGGGTT CAAGACGTGG AGTGGACGGA GTGGTTCGAT CATGATAAAG 1260 

TATCTCAGAC TAAAGGTGGG GAGAAAATTT CAGACCTCTG GAAAGCTCAC CCAGGAAAAA 1320 

TATGCAATCG TCCCATTGAT ATACAGGCCA CTACAATGGA TGGAGTTAAC CTCAGCACCG 1380 

AGGTTGTCTA CAAAAAAGGC CAGGATTATA GGTTTGCTTG CTACGACCGG GGCAGAGCCT 1440 

GCCGGAGCTA CCGTGTACGG TTCCTCTGTG GGAAGCCTGT GAGGCCCAAA CTCACAGTCA 1500 

CCATTGACAC CAATGTGAAC AGCACCATTC TGAACTTGGA GGATAATGTA CAGTCATGGA 1560 

AACCTGGAGA TACCCTGGTC ATTGCCAGTA CTGATTACTC CATGTACCAG GCAGAAGAGT 1620 

TCCAGGTGCT TCCCTGCAGA TCCTGCGCCC CCAACCAGGT CAAAGTGGCA GGGAAACCAA 1680 

TGTACCTGCA CATCGGGGAG GAGATAGACG GCGTGGACAT GCGGGCGGAG GTTGGGCTTC 1740 

TGAGCCGGAA CATCATAGTG ATGGGGGAGA TGGAGGACAA ATGCTACCCC TACAGAAACC 1800 

ACATCTGCAA TTTCTTTGAC TTCGATACCT TTGGGGGCCA CATCAAGTTT GCTCTGGGAT 1860 

TTAAGGCAGC ACACTTGGAG GGCACGGAGC TGAAGCATAT GGGACAGCAG CTGGTGGGTC 1920 

AGTACCCGAT TCACTTCCAC CTGGCCGGTG ATGTAGACGA AAGGGGAGGT TATGACCCAC 1980 

CCACATACAT CAGGGACCTC TCCATCCATC ATACATTCTC TCGCTGCGTC ACAGTCCATG 2040

GCTCCAATGG CTTGTTGATC AAGGACGTTG TGGGCTATAA CTCTTTGGGC CACTGCTTCT 2100

TCACGGAAGA TGGGCCGGAG GAACGCAACA CTTTTGACCA CTGTCTTGGC CTCCTTGTCA 2160

AGTCTGGAAC CCTCCTCCCC TCGGACCGTG ACAGCAAGAT GTGCAAGATG ATCACAGAGG 2220

ACTCCTACCC AGGGTACATC CCCAAGCCCA GGCAAGACTG CAATGCTGTG TCCACCTTCT 2280 

GGATGGCCAA TCCCAACAAC AACCTCATCA ACTGTGCCGC TGCAGGATCT GAGGAAACTG 2340 

GATTTTGGTT TATTTTTCAC CACGTACCAA CGGGCCCCTC CGTGGGAATG TACTCCCCAG 2400 

GTTATTCAGA GCACATTCCA CTGGGAAAAT TCTATAACAA CCGAGCACAT TCCAACTACC 2460 

GGGCTGGCAT GATCATAGAC AACGGAGTCA AAACCACCGA GGCCTCTGCC AAGGACAAGC 2520

GGCCGTTCCT CTCAATCATC TCTGCCAGAT ACAGCCCTCA CCAGGACGCC GACCCGCTGA 2580

AGCCCCGGGA GCCGGCCATC ATCAGACACT TCATTGCCTA CAAGAACCAG GACCACGGGG 2640 

CCTGGCTGCG CGGCGGGGAT GTGTGGCTGG ACAGCTGCCA TTTCAGAGGG GAGGCTCAGG 2700 

AAGGCTTCTT GCTTACAGGA ATGAAGGCTG GGGGCATTTT GCTGGGGGGA GATGAGGCAG 2760 

CCTCTGGAAT GGCTCAGGGA TTCAGCCCTC CCTGCCGCTG CCTGCTGAAG CTGGTGACTA 2820 

CGGGGTCGCC CTTTGCTCAC GTCTCTCTGG CCCACTCATG ATGGAGAAGT GTGGTCAGAG 2880 

GGGAGCAATG GGCTTTGCTG CTTATGAGCA CAGAGGAATT CAGTCCCCAG GCAGCCCTGC 2940 

CTCTGACTCC AAGAGGGTGA AGTCCACAGA AGTGAGCTCC TGCCTTAGGG CCTCATTTGC 3000 

TCTTCATCCA GGGAACTGAG CACAGGGGGC CTCCAGGAGA CCCTAGATGT GCTCGTACTC 3060 

CCTCGGCCTG GGATTTCAGA GCTGGAAATA TAGAAAATAT CTAGCCCAAA GCCTTCATTT 3120 
TAACAGATGG GGAAAGTGAG CCCCCAAGAT GGGAAAGAAC CACACAGCTA AGGGAGGGCC 3180

TGGGGAGCCC CACCCTAGCC CTTGCTGCCA CACCACATTG CCTCAACAAC CGGCCCCAGA 3240

GTGCCCAGGC ACTCCTGAGG TAGCTTCTGG AAATGGGGAC AAGTCCCCTC GAAGGAAAGG 3300

AAATGACTAG AGTAGAATGA CAGCTAGCAG ATCTCTTCCC TCCTGCTCCC AGCGCACACA 3360

AACCCGCCCT CCCCTTGGTG TTGGCGGTCC CTGTGGCCTT CACTTTGTTC ACTACCTGTC 3420

AGCCCAGCCT GGGTGCACAG TAGCTGCAAC TCCCCATTGG TGCTACCTGG CTCTCCTGTC 3480

TCTGCAGCTC TACAGGTGAG GCCCAGCAGA GGGAGTAGGG CTCGCCATGT TTCTGGTGAG 3540

CCAATTTGGC TGATCTTGGG TGTCTGAACA GCTATTGGGT CCACCCCAGT CCCTTTCAGC 3600 

TGCTGCTTAA TGCCCTGCTC TCTCCCTGGC CCACCTTATA GAGAGCCCAA AGAGCTCCTG 3660 

TAAGAGGGAG AACTCTATCT GTGGTTTATA ATCTTGCACG AGGCACCAGA GTCTCCCTGG 3720 

GTCTTGTGAT GAACTACATT TATCCCCTTT CCTGCCCCAA CCACAAACTC TTTCCTTCAA 3780 

AGAGGGCCTG CCTGGCTCCC TCCACCCAAC TGCACCCATG AGACTCGGTC CAAGAGTCCA 3840 

TTCCCCAGGT GGGAGCCAAC TGTCAGGGAG GTCTTTCCCA CCAAACATCT TTCAGCTGCT 3900 

GGGAGGTGAC CATAGGGCTC TGCTTTTAAA GATATGGCTG CTTCAAAGGC CAGAGTCACA 3960 

GGAAGGACTT CTTCCAGGGA GATTAGTGGT GATGGAGAGG AGAGTTAAAA TGACCTCATG 4020 

TCCTTCTTGT CCACGGTTTT GTTGAGTTTT CACTCTTCTA ATGCAAGGGT CTCACACTGT 4080 

GAACCACTTA GGATGTGATC ACTTTCAGGT GGCCAGGAAT GTTGAATGTC TTTGGCTCAG 4140 

TTCATTTAAA AAAGATATCT ATTTGAAAGT TCTCAGAGTT GTACATATGT TTCACAGTAC 4200 

AGGATCTGTA CATAAAAGTT TCTTTCCTAA ACCATTCACC AAGAGCCAAT ATCTAGGCAT 4260 

TTTCTTGGTA GCACAAATTT TCTTATTGCT TAGAAAATTG TCCTCCTTGT TATTTCTGTT 4320 

TGTAAGACTT AAGTGAGTTA GGTCTTTAAG GAAAGCAACG CTCCTCTGAA ATGCTTGTCT 4380 

TTTTTCTGTT GCCGAAATAG CTGGTCCTTT TTCGGGAGTT AGATGTATAG AGTGTTTGTA 4440 

TGTAAACATT TCTTGTAGGC ATCACCATGA ACAAAGATAT ATTTTCTATT TATTTATTAT 4500 

ATGTGCACTT CAAGAAGTCA CTGTCAGAGA AATAAAGAAT TGTCTTAAAT GTCATGATTG 4560 

GAGATGTCCT TTGCATTGCT TGGAAGGGGT GTACCTAGAG CCAAGGAAAT TGGCTCTGGT 4620 

TTGGAAAAAT TTTGCTGTTA TTATAGTAAA CATACAAAGG ATGTCAAAAA AAAAAAAAAA 4680 
AAAAAAAAAA AAAAAAAAAA AA 

Seq ID NO: 453 Protein sequence 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

I I I I I I 

MGAAGRQDFL FKAMLTISWL TLTCFPGATS TVAAGCPDQS PELQPWNPGH DQDHHVHIGQ 60

GKTLLLTSSA TVYSIHISEG GKLVIKDHDE PIVLRTRHIL IDNGGELHAG SALCPFQGNF 120

TIILYGRADE GIQPDPYYGL KYIGVGKGGA LELHGQKKLS WTFLNKTLHP GGMAEGGYFF 180

ERSWGHRGVI VHVIDPKSGT VIHSDRFDTY RSKKESERLV QYLNAVPDGR ILSVAVNDEG 240

SRNLDDMARK AMTKLGSKHF LHLGFRHPWS FLTVKGNPSS SVEDHIEYHG HRGSAAARVF 300

KLFQTEHGEY FNVSLSSEWV QDVEWTEWFD HDKVSQTKGG EKISDLWKAH PGKICNRPID 360

IQATTMDGVN LSTEVVYKKG QDYRFACYDR GRACRSYRVR FLCGKPVRPK LTVTIDTNVN 420

STILNLEDNV QSWKPGDTLV IASTDYSMYQ AEEFQVLPCR SCAPNQVKVA GKPMYLHIGE 480

EIDGVDMRAE VGLLSRNIIV MGEMEDKCYP YRNHICNFFD FDTFGGHIKF ALGFKAAHLE 540

GTELKHMGQQ LVGQYPIHFH LAGDVDERGG YDPPTYIRDL SIHHTFSRCV TVHGSNGLLI 600

KDVVGYNSLG HCFFTEDGPE ERNTFDHCLG LLVKSGTLLP SDRDSKMCKM ITEDSYPGYI 660

PKPRQDCNAV STFWMANPNN NLINCAAAGS EETGFWFIFH HVPTGPSVGM YSPGYSEHIP 720

LGKFYNNRAH SNYRAGMIID NGVKTTEASA KDKRPFLSII SARYSPHQDA DPLKPREPAI 780

IRHFIAYKNQ DHGAWLRGGD VWLDSCHFRG EAQEGFLLTG MKAGGILLGG DEAASGMAQG 840 
FSPPCRCLLK LVTTGSPFAH VSLAHS 

Seq ID NO: 454 DNA sequence

Nucleic Acid Accession #: NM_013282.2

Coding sequence: 85..2466

1 11 21 31 41 51 

I I I I I I 

CGACTCCTTA GAGCATGGCA TGGCTCAGAG GTGCTGGTAA AACTGATGGG GGTTTTTGCT 60 

GTCCCTCCCC TCAGCGCCGA CACCATGTGG ATCCAGGTTC GGACCATGGA CGGGAGGCAG 120 

ACCCACACGG TGGACTCGCT GTCCAGGCTG ACCAAGGTGG AGGAGCTGAG GCGGAAGATC 180 

CAGGAGCTGT TCCACGTGGA GCCAGGCCTG CAGAGGCTGT TCTACAGGGG CAAACAGATG 240 

GAGGACGGCC ATACCCTCTT CGACTACGAG GTCCGCCTGA ATGACACCAT CCAGCTCCTG 300 

GTCCGCCAGA GCCTCGTGCT CCCCCACAGC ACCAAGGAGC GGGACTCCGA GCTCTCCGAC 360 

ACCGACTCCG GCTGCTGCCT GGGCCAGAGT GAGTCAGACA AGTCCTCCAC CCACGGCGAG 420 

GCGGCCGCOG AGACTGACAG CAGGCCAGCC GATGAGGACA TGTGGGATGA GACGGAATTG 480 

GGGCTGTACA AGGTCAATGA GTACGTCGAT GCTCGGGACA CGAACATGGG GGCGTGGTTT 540 

GAGGCGCAGG TGGTCAGGGT GACGCGGAAG GCCCCCTCCC GGGACGAGCC CTGCAGCTCC 600 

ACGTCCAGGC CGGCGCTGGA GGAGGACGTC ATTTACCACG TGAAATACGA CGACTACCCG 660 

GAGAACGGCG TGGTCCAGAT GAACTCCAGG GACGTCCGAG CGCGCGCCCG CACCATCATC 720

AAGTGGCAGG ACCTGGAGGT GGGCCAGGTG GTCATGCTCA ACTACAACCC CGACAACCCC 780 

AAGGAGCGGG GCTTCTGGTA CGACGCGGAG ATCTCCAGGA AGCGCGAGAC CAGGACGGCX3 840 

CGGGAACTCT ACGCCAACGT GGTGCTGGGG GATGATTCTC TGAACGACTG TCGGATCATC 900 

TTCGTGGACG AAGTCTTCAA GATTGAGCGG CCGGGTGAAG GGAGCCCCAT GGTTGACAAC 960

CCCATGAGAC GGAAGAGCGG GCCGTCCTGC AAGCACTGCA AGGACGACGT GAACAGACTC 1020 

TGCCGGGTCT GCGCCTGCCA CCTGTGCGGG GGCCGGCAGG ACCCCGACAA GCAGCTCATG 1080 

TGCGATGAGT GCGACATGGC CTTCCACATC TACTGCCTGG ACCCGCCCCT CAGCAGTGTT 1140 

CCCAGCGAGG ACGAGTGGTA CTGCCCTGAG TGCCGGAATG ATGCCAGCGA GGTGGTACTG 1200 

GCGGGAGAGC GGCTGAGAGA GAGCAAGAAG AAGGCGAAGA TGGCCTCGGC CACATCGTCC 1260

TCACAGCGGG ACTGGGGCAA GGGCATGGCC TGTGTGGGCC GCACCAAGGA ATGTACCATC 1320

GTCCCGTCCA ACCACTACGG ACCCATCCCG GGGATCCCCG TGGGCACCAT GTGGCGGTTC 1380 

CGAGTCCAGG TCAGCGAGTC GGGTGTCCAT CGGCCCCACG TGGCTGGCAT ACACGGCCGG 1440 

AGCAACGACG GAGCGTACTC CCTAGTCCTG GCGGGGGGCT ATGAGGATGA CGTGGACCAT 1500 

GGGAATTTTT TCACATACAC GGGTAGTGGT GGTCGAGATC TTTCCGGCAA CAAGAGGACC 1560 

GCGGAACAGT CTTGTGATCA GAAACTCACC AACACCAACA GGGCGCTGGC TCTCAACTGC 1620 

TTTGCTCCCA TCAATGACCA AGAAGGGGCC GAGGCCAAGG ACTGGCGGTC GGGGAAGCCG 1680 

GTCAGGGTGG TGCGCAATGT CAAGGGTGGC AAGAATAGCA AGTACGCCCC CGCTGAGGGC 1740 

AACCGCTACG ATGGCATCTA CAAGGTTGTG AAATACTGGC CCGAGAAGGG GAAGTCCGGG 1800 

TTTCTCGTGT GGCGCTACCT TCTGCGGAGG GACGATGATG AGCCTGGCCC TTGGACGAAG 1860 

GAGGGGAAGG ACCGGATCAA GAAGCTGGGG CTGACCATGC AGTATCCAGA AGGCTACCTG 1920 
GAAGCCCTGG CCAACCGAGA GCGAGAGAAG GAGAACAGCA AGAGGGAGGA GGAGGAGCAG 1980

CAGGAGGGGG GCTTCGCGTC CCCCAGGACG GGCAAGGGCA AGTGGAAGCG GAAGTCGGCA 2040 

GGAGGTGGCC CGAGCAGGGC CGGGTCCCCG CGCCGGACAT CCAAGAAAAC CAAGGTGGAG 2100 

CCCTACAGTC TCACGGCCCA GCAGAGCAGC CTCATCAGAG AGGACAAGAG CAACGCCAAG 2160

CTGTGGAATG AGGTCCTGGC GTCACTCAAG GACCGGCCGG CGAGCGGCAG CCCGTTCCAG 2220

TTGTTCCTGA GTAAAGTGGA GGAGACGTTC CAGTGTATCT GCTGTCAGGA GCTGGTGTTC 2280 

CGGCCCATCA CGACCGTGTG CCAGCACAAC GTGTGCAAGG ACTGCCTGGA CAGATCCTTT 2340 

CGGGCACAGG TGTTCAGCTG CCCTGCCTGC CGCTACGACC TGGGCCGCAG CTATGCCATG 2400 

CAGGTGAACC AGCCTCTGCA GACCGTCCTC AACCAGCTCT TCCCCGGCTA CGGCAATGGC 2460 

CGGTGATCTC CAAGCACTTC TCGACAGGCG TTTTGCTGAA AACGTGTCGG AGGGCTCGTT 2520 

CATCGGCACT GATTTTGTTC TTAGTGGGCT TAACTTAAAC AGGTAGTGTT TCCTCCGTTC 2580 

CCTAAAAAGG TTTGTCTTCC TTTTTTTTTA TTTTTATTTT TCAAATCTAT ACATTTTCAG 2640 

GAATTTATGT ATTCTGGCTA AAAGTTGGAC TTCTCAGTAT TGTGTTTAGT TCTTTGAAAA 2700 

CATAAAAGCC TGCAATTTCT CGACAAAACA ACACAAGATT TTTTAAAGAT GGAATCAGAA 2760 

ACTACGTGGT GTGGAGGCTG TTGATGTTTC TGGTGTCAAG TTCTCAGAAG TTGCTGCCAC 2820 

CAACTCTTTA AGAAGGCGAC AGGATCAGTC CTTCTCTAGG GTTCTGGCCC CCAAGGTCAG 2880 

AGCAAGCATC TTCCTGACAG CATTTTGTCA TCTAAAGTCC AGTGACATGG TTCCCCGTGG 2940 

TGGCCCGTGG CAGCCCGTGG CATGGCGTGG CTCAGCTGTC TGTTGAAGTT GTTGCAAGGA 3000 

AAAGAGGAAA CATCTCGGGC CTAGTTCAAA CCTTTGCCTC AAAGCCATCC CCCACCAGAC 3060 

TGCTTAGCGT CTGAGATCCG CGTGAAAAGT CCTCTGCCCA CGAGAGCAGG GAGTTGGGGC 3120 

CACGCAGAAA TGGCCTCAAG GGGACTCTGC TCCACGTGGG GCCAGGCGTG TGACTGACGC 3180 

TGTCCGACGA AGGCGGCCAC GGACGGACGC CAGCACACGA AGTCACGTGC AAGTGCCTTT 3240 

GATTCGTTCC TTCTTTCTAA AGACGACAGT CTTTGTTGTT AGCACTGAAT TATTGAAAAT 3300 

GTCAACCAGA TTCTAGAAAC TGCGGTCATC CAGTTCTTCC TGACACCGGA TGGGTGCTTG 3360 

GGAACCGTTT GAGCCTTATA GATCATTTAC ATTCAATTTT TTTAACTCAG CAAGTGAGAA 3420 

CTTACAAGAG GGTTTTTTTT TAATTTTTTT TTCTCTTAAT GAACACATTT TCTAAATGAA 3480

TTTTTTTTGT AGTTACTGTA TATGTACCAA GAAAGATATA ACGTTAGGGT TTGGTTGTTT 3540 

TTGTTTTTGT ATTTTTTTTC TTTTGAAAGG GTTTGTTAAT TTTTCTAATT TTACCAAAGT 3600 

TTGCAGCCTA TACCTCAATA AAACAGGGAT ATTTTAAATC ACATACCTGC AGACAAACTG 3660 

GAGCAATGTT ATTTTTAAAG GGTTTTTTTC ACCTCCTTAT TCTTAGATTA TTAATGTATT 3720 

AGGGAAGAAT GAGACAATTT TGTGTAGGCT TTTTCTAAAG TCCAGTACTT TGTCCAGATT 3780 
TTAGATTCTC AGAATAAATG TTTTTCACAG ATTGAAAAAA AAAAAAAA 

Seq ID NO: 455 Protein sequence 
Protein Accession ft: NP_037414.2 

1 11 21 31 41 51 

I I I I I I 

MWIQVRTMDG RQTHTVDSLS RLTKVEELRR KIQELFHVEP GLQRLFYRGK QMEDGHTLFD 60 

YEVRLNDTIQ LLVRQSLVLP HSTKERDSEL SDTDSGCCLG QSESDKSSTH GEAAAETDSR 120 

PADEDMWDET ELGLYKVNEY VDARDTNMGA WFEAQVVRVT RKAPSRDEPC SSTSRPALEE 180

DVIYHVKYDD YPENGVVQMN SRDVRARART IIKWQDLEVG QVVMLNYNPD NPKERGFWYD 240

AEISRKRETR TARELYANVV LGDDSLNDCR IIFVDEVFKI ERPGEGSPMV DNPMRRKSGP 300

SCKHCKDDVN RLCRVCACHL CGGRQDPDKQ LMCDECDMAF HIYCLDPPLS SVPSEDEWYC 360

PECRNDASEV VLAGERLRES KKKAKMASAT SSSQRDWGKG MACVGRTKEC TIVPSNHYGP 420

IPGIPVGTMW RFRVQVSESG VHRPHVAGIH GRSNDGAYSL VLAGGYEDDV DHGNFFTYTG 480

SGGRDLSGNK RTAEQSCDQK LTNTNRALAL NCFAPINDQE GAEAKDWRSG KPVRVVRNVK 540

GGKNSKYAPA EGNRYDGIYK VVKYWPEKGK SGFLVWRYLL RRDDDEPGPW TKEGKDRIKK 600

LGLTMQYPEG YLEALANRER EKENSKREEE EQQEGGFASP RTGKGKWKRK SAGGGPSRAG 660

SPRRTSKKTK VEPYSLTAQQ SSLIREDKSN AKLWNEVLAS LKDRPASGSP FQLFLSKVEE 720
TFQCICCQEL VFRPITTVCQ HNVCKDCLDR SFRAQVFSCP ACRYDLGRSY AMQVNQPLQT 

Seq ID NO: 456 DNA sequence

Nucleic Acid Accession #: NM_001200.1

Coding sequence: 325..1514

1 11 21 31 41 51 
GGGGACTTCT TGAACTTGCA GGGAGAATAA CTTGCGCACC CCACTTTGCG CCGGTGCCTT 60

TGCCCCAGCG GAGCCTGCTT CGCCATCTCC GAGCCCCACC GCCCCTCCAC TCCTCGGCCT 120 

TGCCCGACAC TGAGACGCTG TTCCCAGCGT GAAAAGAGAG ACTGCGCGGC CGGCACCCGG 180 

GAGAAGGAGG AGGCAAAGAA AAGGAACGGA CATTCGGTCC TTGCGCCAGG TCCTTTGACC 240 

AGAGTTTTTC CATGTGGACG CTCTTTCAAT GGACGTGTCC CCGCGTGCTT CTTAGACGGA 300 

CTGCGGTCTC CTAAAGGTCG ACCATGGTGG CCGGGACCCG CTGTCTTCTA GCGTTGCTGC 360 

TTCCCCAGGT CCTCCTGGGC GGCGCGGCTG GCCTCGTTCC GGAGCTGGGC CGCAGGAAGT 420 

TCGCGGCGGC GTCGTGGGGC CGCCCCTCAT CCCAGCCCTC TGACGAGGTC CTGAGCGAGT 480 

TCGAGTTGCG GCTGCTCAGC ATGTTCGGCC TGAAACAGAG ACCCACCCCC AGCAGGGACG 540 

CCGTGGTGCC CCCCTACATG CTAGACCTGT ATCGCAGGCA CTCAGGTCAG CCGGGCTCAC 600 

CCGCCCCAGA CCACCGGTTG GAGAGGGCAG CCAGCGGAGC CAACACTGTG CGCAGCTTCC 660 

ACCATGAAGA ATCTTTGGAA GAACTACCAG AAACGAGTGG GAAAACAACC CGGAGATTCT 720 

TCTTTAATTT AAGTTCTATC CCCACGGAGG AGTTTATCAC CTCAGCAGAG CTTCAGGTTT 780 

TCCGAGAACA GATGCAAGAT GCTTTAGGAA ACAATAGCAG TTTCCATCAC CGAATTAATA 840 

TTTATGAAAT CATAAAACCT GCAACAGCCA ACTCGAAATT CCCCGTGACC AGACTTTTGG 900 

ACACCAGGTT GGTGAATCAG AATGCAAGCA GGTGGGAAAG TTTTGATGTC ACCCCCGCTG 960 

TGATGCGGTG GACTGCACAG GGACACGCCA ACCATGGATT GGTGGTGGAA GTGGCCCACT 1020 

TGGAGGAGAA ACAAGGTGTC TCCAAGAGAC ATGTTAGGAT AAGCAGGTCT TTGCACCAAG 1080 

ATGAACACAG CTGGTCACAG ATAAGGCCAT TGCTAGTAAC TTTTGGCCAT GATGGAAAAG 1140 

GGCATCCTCT CCACAAAAGA GAAAAACGTC AAGCCAAACA CAAACAGCGG AAACGCCTTA 1200 

AGTCCAGCTG TAAGAGACAC CCTTTGTACG TGGACTTCAG TGACGTGGGG TGGAATGACT 1260 

GGATTGTGGC TCCCCCGGGG TATCACGCCT TTTACTGCCA CGGAGAATGC CCTTTTCCTC 1320 

TGGCTGATCA TCTGAACTCC ACTAATCATG CCATTGTTCA GACGTTGGTC AACTCTGTTA 1380 

ACTCTAAGAT TCCTAAGGCA TGCTGTGTCC CGACAGAACT CAGTGCTATC TCGATGCTGT 1440 

ACCTTGACGA GAATGAAAAG GTTGTATTAA AGAACTATCA GGACATGGTT GTGGAGGGTT 1500 
GTGGGTGTCG CTAGTACAGC AAAATTAAAT ACATAAATAT ATATATA 

Seq ID NO: 457 Protein sequence

MVAGTRCLLA LLLPQVLLGG AAGLVPELGR RKFAAASSGR
FGLKQRPTPS RDAVVPPYML DLYRRHSGQP GSPAPDHRLE
LPSTSGKTTR RFFFNLSSIP TEEFITSAEL QVFREQMQDA
TANSKFPVTR LLDT

Seq ID NO: 458 DNA sequence 
Nucleic Acid Accession #: NM_001999.2 
Coding sequence: 1..8736 

1 11 21 31 41 51 

I I I I I I

ATGGGGAGAA GACGGAGGCT GTGTCTCCAG CTCTACTTCC TGTGGCTGGG CTGTGTGGTG 60 

CTCTGGGCGC AGGGCACGGC CGGCCAGCCT CAGCCTCCTC CGCCCAAGCC GCCCCGGCCC 120 

CAGCCGCCGC OGCAACAGGT TCGGTCCGCT ACAGCAGGCT CTGAAGGCGG GTTTCTAGCG 180 

CCCGAGTATC GCGAGGAGGG TGCCGCAGTG GCCAGCCGCG TCCGCCGGCG AGGACAGCAG 240 

GACGTGCTCC GAGGGCCCAA CGTGTGCGGC TCCAGATTCC ACTCCTACTG CTGCCCTGGA 300 

TGGAAGACGC TCCCTGGAGG AAACCAGTGC ATTGTCCCGA TTTGTAGAAA TAGTTGTGGA 360 

GATGGATTTT GTTCCCGTCC TAACATGTGT ACTTGTTCCA GTGGGCAAAT ATCATCAACC 420 

TGTGGATCAA AATCAATTCA GCAGTGCAGT GTGAGATGCA TGAATGGTGG GACCTGTGCA 480 

GATGACCACT GCCAGTGCCA GAAAGGATAT ATTGGAACTT ATTGTGGACA ACCTGTCTGT 540 

GAAAATGGAT GTCAGAATGG TGGACGTTGC ATCGCCCAAC CGTGTGCTTG TGTTTATGGG 600 

TTCACTGGTC CACAGTGTGA AAGAGATTAC AGGACAGGCC CGTGTTTCAC TCAGGTCAAC 660 

AACCAGATGT GCCAAGGGCA GCTGACAGGC ATTGTCTGCA CGAAGACTCT GTGCTGTGCC 720 

ACCACTGGAC GGGCGTGGGG CCATCCCTGT GAGATGTGTC CAGCCCAGCC TCAGCCCTGC 780 

CGACGGGGTT TCATCCCCAA CATCCGCACT GGAGCTTGCC AAGATGTTGA TGAATGCCAG 840 

GCTATCCCAG GGATATGCCA AGGAGGAAAC TGTATCAATA CAGTGGGCTC TTTTGAATGC 900 

AGATGCCCTG CTGGTCACAA ACAGAGTGAA ACTACTCAGA AATGTGAAGA CATTGATGAG 960 

TGCAGCATCA TTCCTGGGAT ATGTGAAACT GGTGAATGTT CCAACACCGT GGGAAGCTAT 1020 

TTTTGTGTTT GTCCACGTGG ATATGTAACC TCAACAGATG GCTCTCGATG CATCGATCAG 1080 

AGAACAGGCA TGTGTTTCTC GGGCCTGGTG AATGGCCGCT GTGCACAAGA GCTCCCGGGG 1140 

AGAATGACGA AAATGCAGTG CTGCTGTGAG CCTGGCCGCT GCTGGGGCAT CGGAACCATT 1200 

CCTGAAGCCT GTCCTGTCAG AGGTTCTGAG GAATATCGCA GACTTTGCAT GGATGGACTT 1260 

CCAATGGGAG GAATTCCAGG GAGTGCTGGT TCCAGACCTG GAGGCACTGG GGGAAATGGC 1320 

TTTGCCCCAA GTGGCAATGG CAATGGCTAT GGCCCAGGAG GGACAGGCTT CATCCCCATC 1380 

CCTGGAGGCA ATGGCTTTTC TCCTGGCGTT GGGGGAGCCG GTGTGGGGGC CGGGGGACAG 1440 

GGACCTATCA TCACTGGACT AACAATTCTG AACCAGACAA TAGATATCTG TAAGCATCAT 1500 

GCTAACCTTT GTTTAAATGG ACGCTGTATA CCAACTGTCT CAAGCTACCG ATGTGAATGC 1560 

AACATGGGTT ATAAGCAGGA TGCAAATGGA GATTGTATAG ATGTTGATGA ATGCACATCA 1620 

AATCCCTGCA CTAATGGAGA TTGTGTTAAC ACACCTGGTT CCTATTATTG TAAATGTCAT 1680 

GCTGGATTCC AGAGGACTCC TACCAAGCAA GCATGCATTG ATATTGATGA GTGCATCCAG 1740 

AATGGGGTTC TTTGTAAAAA CGGTCGATGC GTGAACTCAG ATGGAAGTTT CCAGTGCATT 1800 

TGCAATGCCG GCTTTGAATT AACTACAGAT GGAAAAAACT GTGTTGATCA TGATGAATGT 1860 

ACAACTACCA ACATGTGTTT GAATGGAATG TGCATCAATG AAGATGGCAG CTTCAAGTGC 1920 

ATCTGCAAAC CAGGATTTGT CTTGGCTCCA AATGGGCGTT ACTGTACTGA TGTTGATGAA 1980 

TGCCAGACCC CAGGAATCTG CATGAATGGG CACTGCATCA ACAGTGAAGG GTCCTTCCGC 2040 

TGTGACTGTC CCCCAGGCCT GGCTGTGGGC ATGGATGGAC GTGTGTGTGT TGATACTCAC 2100 

ATGCGCAGTA CCTGCTATGG AGGAATCAAG AAAGGAGTGT GTGTGCGTCC TTTCCCCGGT 2160 

GCAGTGACCA AGTCCGAATG CTGCTGTGCC AATCCAGACT ATGGTTTTGG AGAACCCTGC 2220 

CAGCCATGCC CTGCAAAAAA TTCAGCTGAA TTCCACGGCC TTTGTAGTAG TGGAGTAGGT 2280 

ATCACTGTGG ATGGAAGAGA TATCAATGAA TGTGCTTTGG ATCCTGATAT ATGTGCCAAT 2340 

GGGATTTGTG AAAACTTACG TGGTAGTTAC CGTTGTAATT GCAACAGTGG CTATGAACCA 2400 

GATGCCTCTG GAAGAAACTG TATTGACATT GATGAATGTT TAGTAAACAG ACTGCTTTGT 2460 

GATAACGGAT TGTGCCGAAA CACGCCAGGA AGTTACAGCT GTACGTGCCC ACCAGGGTAT 2520 

GTGTTCAGGA CTGAGACAGA GACCTGTGAA GATATAAATG AATGTGAAAG CAACCCATGT 2580 

GTCAATGGGG CCTGCAGAAA CAAGCTTGGA TCTTTCAATT GTGAATGTTC GCCCGGCAGC 2640

AAACTCAGCT CCACAGGATT GATCTGTATT GACAGCCTGA AGGGGACCTG TTGGCTCAAC 2700 

ATCCAGGACA GCCGCTGTGA GGTGAATATT AATGGAGCCA CTCTGAAATC TGAATGCTGT 2760 

GCCACCCTCG GAGCCGCCTG GGGGAGCCCC TGTGAGCGGT GTGAACTAGA TACAGCTTGC 2820 

CCAAGAGGGC TTGCCAGGAT TAAAGGTGTT ACGTGTGAAG ATGTTAATGA GTGTGAGGTG 2880 

TTCCCTGGCG TTTGTCCAAA TGGACGCTGT GTCAACAGTA AGGGATCTTT TCATTGCGAG 2940 

TGCCCTGAAG GCCTTACGTT GGATGGGACT GGCCGTGTAT GTTTGGATAT TCGCATGGAG 3000 

CAGTGTTACT TGAAGTGGGA TGAAGATGAA TGCATCCACC CCGTTCCTGG AAAGTTCCGC 3060 

ATGGATGCCT GCTGCTGTGC TGTCGGGGCG GCTTGGGGCA CCGAGTGTGA GGAGTGCCCC 3120 

AAACCTGGCA CCAAGGAATA CGAGACACTG TGCCCCCGCG GGGCTGGCTT TGCTAACCGA 3180 

GGGGATGTTC TTACTGGGCG GCCATTTTAC AAAGACATCA ATGAATGCAA AGCATTTCCT 3240 

GGGATGTGCA CTTATGGGAA GTGCAGAAAT ACAATCGGAA GCTTCAAATG CCGTTGCAAT 3300 

AGTGGCTTTG CTCTAGACAT GGAGGAAAGA AACTGCACGG ACATCGACGA GTGCAGGATT 3360 

TCTCCTGACC TCTGTGGCAG TGGAATCTGC GTCAATACAC CGGGCAGCTT TGAGTGCGAG 3420 

TGCTTCGAAG GCTATGAAAG TGGCTTCATG ATGATGAAGA ACTGCATGGA CATTGACGGA 3480 

TGTGAACGTA ACCCTCTCCT TTGTAGGGGT GGCACCTGTG TGAACACTGA GGGCAGCTTT 3540 

CAGTGTGACT GCCCACTGGG ACACGAGCTG TCACCATCCC GTGAGGACTG TGTGGATATT 3600 

AATGAATGCT CCCTGAGTGA CAATCTCTGC AGAAATGGAA AATGTGTGAA CATGATTGGA 3660

ACCTATCAGT GCTCTTGCAA TCCTGGATAT CAGGCTACGC CAGACCGCCA GGGCTGTACA 3720 

GATATTGATG AATGTATGAT AATGAACGGA GGCTGTGACA CCCAGTGCAC AAATTCAGAG 3780 

GGAAGCTACG AATGCAGCTG CAGTGAGGGT TATGCCCTGA TGCCAGATGG GAGATCGTGT 3840 

GCAGACATTG ATGAATGTGA AAACAATCCT GATATCTGTG ATGGCGGCCA GTGTACCAAC 3900 

ATTCCTGGAG AGTATCGCTG CCTCTGCTAT GATGGCTTCA TGGCTTCCAT GGACATGAAA 3960 

ACATGCATTG ATGTCAATGA ATGTGACCTA AATTCAAATA TCTGCATGTT TGGGGAATGT 4020 

GAGAACACAA AGGGATCCTT CATTTGCCAC TGTCAGCTGG GTTACTCAGT GAAGAAGGGG 4080 

ACCACAGGAT GTACAGATGT GGATGAGTGT GAAATTGGTG CTCATAACTG CGACATGCAT 4140 

GCCTCATGTC TGAATATCCC AGGAAGCTTC AAGTGTAGCT GCAGAGAAGG CTGGATTGGA 4200 

AACGGCATCA AGTGTATTGA TCTGGACGAA TGTTCTAATG GAACCCACCA GTGTAGCATC 4260 

AATGCTCAGT GTGTAAATAC CCCGGGCTCA TACCGCTGTG CCTGCTCCGA AGGTTTCACT 4320 

GGTGATGGCT TTACCTGCTC AGATGTTGAT GAGTGTGCAG AAAACATAAA CCTCTGTGAG 4380 
LGNNSSFHHR INIYEIIKPA 180
AACGGACAGT GCCTTAATGT CCCGGGTGCA TATCGCTGCG AGTGTGAGAT GGGCTTCACT 4440

CCAGCCTCAG ACAGCAGATC CTGCCAAGAT ATTGATGAAT GCTCCTTCCA AAACATTTGT 4500 

GTCTCTGGAA CATGTAATAA CCTGCCTGGA ATGTTTCATT GCATCTGCGA TGATGGTTAT 4560 

GAATTGGACA GAACAGGAGG GAACTGTACA GATATTGATG AGTGTGCAGA TCCTATAAAC 4620 

TGTGTCAATG GCCTATGTGT CAACACGCCT GGTCGCTATG AGTGTAACTG CCCACCCGAT 4680

TTTCAGTTGA ACCCAACTGG TGTGGGTTGT GTTGACAACC GTGTGGGCAA CTGCTACCTG 4740 

AAGTTTGGAC CTCGAGGAGA TGGGAGTCTG TCTTGCAACA CCGAGATCGG GGTGGGCGTC 4800

AGTCGCTCTT CATGCTGCTG CTCTCTGGGA AAGGCCTGGG GAAACCCCTG TGAGACATGC 4860

CCCCCTGTCA ATAGCACTGA ATATTACACC CTGTGTCCCG GAGGTGAAGG CTTCAGACCT 4920 

AACCCCATCA CAATCATTTT AGAAGACATT GACGAATGCC AGGAGTTACC AGGTCTCTGC 4980 

CAGGGTGGAA ACTGCATCAA CACTTTTGGG AGCTTCCAGT GTGAGTGCCC ACAAGGCTAC 5040 

TACCTCAGCG AGGATACCCG CATCTGTGAG GATATTGATG AGTGTTTTGC ACATCCTGGT 5100 

GTGTGTGGGC CTGGGACCTG CTATAACACC CTGGGAAATT ACACCTGCAT TTGCCCACCT 5160 

GAGTACATGC AGGTCAATGG AGGCCACAAC TGCATGGACA TGAGAAAAAG CTTTTGCTAC 5220 

CGAAGCTATA ATGGAACCAC TTGTGAGAAT GAGTTGCCTT TCAATGTGAC AAAAAGGATG 5280 

TGCTGCTGCA CATATAATGT GGGCAAAGCT GGGAACAAAC CTTGTGAACC ATGCCCAACT 5340 

CCAGGAACAG CTGACTTTAA AACCATATGT GGAAATATTC CTGGATTCAC CTTTGACATT 5400 

CACACAGGAA AAGCTGTTGA CATTGATGAA TGTAAAGAGA TTCCAGGCAT TTGTGCAAAT 5460 

GGTGTGTGCA TTAACCAGAT TGGCAGTTTC CGCTGTGAAT GCCCTACAGG ATTCAGTTAC 5520 

AATGACCTGC TGTTGGTTTG TGAAGATATA GATGAGTGCA GCAATGGTGA TAATCTCTGC 5580 

CAGCGGAATG CAGACTGCAT CAATAGTCCT GGTAGTTACC GCTGTGAATG TGCCGCGGGT 5640

TTCAAACTTT CACCCAATGG GGCCTGTGTA GATCGCAATG AATGTTTAGA AATTCCTAAC 5700 

GTTTGCAGTC ATGGCTTGTG TGTTGATCTG CAAGGAAGTT ACCAGTGCAT CTGCCACAAT 5760 

GGCTTTAAGG CTTCTCAGGA CCAGACCATG TGCATGGATG TTGATGAGTG CGAGCGGCAC 5820 

CCATGTGGAA ATGGAACTTG TAAAAACACC GTTGGATCCT ATAACTGTCT GTGCTACCCA 5880

GGGTTTGAAC TCACTCATAA TAATGATTGC CTGGACATAG ATGAGTGCAG TTCCTTTTTT 5940 

GGTCAGGTGT GCAGAAATGG ACGTTGTTTT AATGAAATTG GTTCTTTCAA GTGTCTATGT 6000 

AACGAAGGTT ATGAACTTAC CCCAGATGGC AAAAACTGTA TAGACACTAA TGAGTGTGTC 6060 

GCCCTTCCCG GCTCTTGCTC TCCTGGTACC TGTCAGAATT TGGAGGGATC CTTCAGATGC 6120 

ATCTGTCCCC CAGGGTATGA AGTAAAAAGC GAGAACTGCA TTGATATAAA TGAATGTGAT 6180 

GAAGATCCCA ACATTTGTCT TTTTGGTTCC TGTACTAATA CTCCAGGGGG CTTCCAGTGC 6240 

CTCTGCCCCC CTGGCTTTGT ACTATCTGAT AATGGACGGA GATGCTTTGA TACTCGCCAG 6300 

AGCTTCTGCT TCACAAATTT TGAAAATGGA AAGTGTTCTG TACCCAAAGC TTTCAACACC 6360 

ACAAAAGCAA AATGCTGCTG TAGTAAGATG CCAGGAGAGG GCTGGGGGGA CCCCTGTGAG 6420 

CTGTGCCCCA AAGACGATGA AGTTGCATTT CAGGATTTGT GTCCATATGG CCATGGAACT 6480 

GTCCCTAGTC TTCATGATAC ACGTGAAGAT GTCAATGAGT GTCTTGAGAG CCCAGGCATT 6540 

TGTTCAAATG GTCAATGTAT CAACACCGAC GGATCTTTTC GCTGTGAATG TCCAATGGGC 6600 

TACAACCTTG ACTACACTGG AGTACGCTGT GTGGATACTG ATGAGTGTTC AATCGGCAAT 6660 

CCGTGTGGAA ATGGTACATG CACCAATGTT ATTGGGAGTT TTGAATGCAA TTGCAATGAA 6720 

GGCTTTGAGC CAGGGCCCAT GATGAATTGT GAAGATATCA ACGAATGTGC CCAGAACCCA 6780 

CTGCTGTGTG CTTTACGCTG CATGAACACT TTTGGGTCCT ATGAATGCAC GTGCCCGATT 6840 

GGCTATGCCC TCAGGGAAGA TCAAAAGATG TGCAAAGATC TGGATGAATG TGCTGAAGGG 6900 

TTACACGACT GTGAATCTAG GGGCATGATG TGTAAGAATC TAATCGGCAC CTTCATGTGC 6960 

ATCTGCCCTC CTGGAATGGC CCGAAGGCCC GATGGAGAAG GCTGTGTAGA TGAAAATGAA 7020 

TGCAGGACCA AGCCAGGAAT CTGTGAAAAT GGACGTTGTG TTAACATTAT TGGAAGCTAT 7080 

AGATGTGAGT GTAATGAAGG ATTCCAGTCA AGTTCTTCAG GCACTGAATG CCTTGACAAT 7140 

CGACAGGGTC TCTGCTTTGC AGAGGTACTG CAGACAATAT GTCAAATGGC ATCCAGTAGT 7200 

CGCAATCTCG TCACTAAGTC AGAATGCTGC TGTGATGGTG GGCGAGGCTG GGGCCACCAG 7260 

TGCGAGCTTT GCCCACTTCC TGGAACTGCC CAGTACAAAA AGATATGTCC TCATGGCCCA 7320 

GGATATACAA CTGATGGAAG AGATATTGAT GAATGTAAGG TAATGCCAAA CCTCTGCACC 7380 

AATGGTCAGT GCATCAATAC CATGGGCTCA TTCCGATGCT TCTGCAAGGT TGGCTACACC 7440 

ACAGACATCA GTGGAACCTC TTGTATAGAC CTTGATGAAT GCTCCCAGTC CCCGAAACCA 7500

TGCAACTACA TCTGCAAGAA CACTGAGGGG AGTTATCAGT GTTCATGTCC GAGGGGGTAT 7560

GTCCTGCAAG AGGATGGAAA GACATGCAAA GACCTTGATG AATGTCAAAC AAAGCAGCAT 7620 

AACTGCCAGT TCCTCTGTGT CAACACCCTG GGGGGGTTTA CCTGTAAATG TCCACCTGGT 7680 

TTCACACAGC ATCACACTGC TTGTATCGAC AACAACGAAT GTGGGTCTCA ACCTTTGCTT 7740 

TGTGGAGGAA AGGGAATCTG TCAAAACACT CCAGGCAGTT TCAGCTGTGA ATGCCAAAGA 7800 

GGGTTCTCTC TTGATGCCAC CGGACTGAAC TGTGAAGATG TTGATGAATG TGATGGGAAC 7860 

CACAGGTGCC AACACGGCTG CCAGAACATC CTGGGTGGCT ACAGATGTGG CTGCCCCCAA 7920

GGCTACATCC AGCACTACCA GTGGAATCAG TGTGTCGATG AGAATGAATG CTCCAATCCC 7980 

AATGCCTGTG GCTCTGCTTC CTGCTACAAC ACCCTGGGGA GTTACAAGTG CGCCTGCCCC 8040 

TCGGGGTTCT CCTTCGACCA GTTCTCCAGT GCCTGCCACG ACGTGAATGA GTGCTCGTCC 8100 

TCCAAGAACC CCTGCAATTA CGGCTGCTCT AACACGGAGG GGGGCTACCT CTGTGGCTGC 8160 

CCCCCTGGGT ATTACAGAGT GGGACAAGGC CACTGTGTCT CAGGAATGGG ATTTAACAAG 8220 

GGGCAGTACC TGTCACTGGA TACAGAGGTC GATGAGGAAA ATGCTCTGTC CCCAGAAGCA 8280 

TGCTACGAGT GCAAAATCAA CGGCTATCCT AAGAAAGACA GCAGGCAGAA GAGAAGTATT 8340 

CATGAACCTG ATCCCACTGC TGTTGAACAG ATCAGCCTAG AGAGTGTCGA CATGGACAGC 8400 

CCCGTCAACA TGAAGTTCAA CCTCTCCCAC CTCGGCTCTA AGGAGCACAT CCTGGAACTA 8460 

AGGCCCGCCA TCCAGCCCCT CAACAACCAC ATCCGTTATG TCATCTCTCA AGGGAACGAT 8520 

GACAGCGTCT TCCGCATCCA CCAAAGGAAT GGGCTCAGCT ACTTGCACAC GGCCAAGAAG 8580 

AAGCTCATGC CCGGCACATA CACACTGGAA ATCACTAGCA TCCCTCTCTA CAAGAAGAAG 8640 

GAGCTTAAGA AACTGGAAGA GAGCAATGAG GATGACTACC TCCTAGGGGA GCTTGGGGAG 8700 

GCTCTCAGAA TGAGGCTGCA GATTCAGCTC TATTAACCGT TCACAGACTT GGGCCCAGGC 8760 

TCAAATCCTA GCACAGCCAG TCTGCAGAAG CATTTGAAAA GTCAAGGACT AATTTTAAAG 8820 

AGGAAAAATA ATAATAACTC TTGTTTCTTT CCTCCCTGTC TTAGACTTTG AATGTTGACC 8880 

CTCACAGGGA GGGATAATTT AGACTCTGGT ATGGCCAAAG ATTTGAGCTC AAAGGCAACC 8940 

GTGGTTACTG TATTTTTTAT ATAACTTCAT TTTAAAATAT ATTAAAAGAA ACCTAAATGT 9000 

TCAAGATATC AGCATATGGC ACTAAATGCA CAAAAATAAT GTGAGCTTTT TTTTTTTTTT 9060 

CCTGTTAGCA GTCTGTAACA CTTTGGGTAT TTTGCTATAG TTGCTAATTA AAAAAATATA 9120 

GATGTTTATT TATTTTTAAT GCAGTAATAT ATGGAGAAAT GAACAAACTA TGTAAACAAA 9180 

AAGGGAAACT CACTTGTTTT TCTTTAGATT TATAAATTTG AGCTATTTTT TTTAGAGGTG 9240 

CTTTTTAAAA ATCCAATAGA TACAAGAGAT GTTTCCTTTG GTTTTCTGCC AGTCATCCAG 9300 

CTGATACACA CCTGATCGAT TTTAAAGAAA GCCACACAGA GCTGAATCGG GCAGTGCTAA 9360 

TCAATAATTT AAAAGACATG AATGTCATTA GATCCTTTAT AACGTAGATC GAAGCCAAAG 9420 

CAGCTCATTT GTGACAACAT TTCATATCAC CAGACACACC AGGCAACAGA AGTTGAAGCA 9480 

CAACCACTGT AGCAAAATAC CTTGACTGCT TGTGAGACCA TTAGCATTGC AGGCCAAACC 9540 

GTACTGTATT TCCTTCTCAT AACCTCAAGG AACCATATGT GCTACCCACA ACACCTCATT 9600 
CTTACCCAGG GTGCGCTGCG TCCTCATGGT ACTGTAGGCA GCTGAAGAAC CGCCGTTCCC 9660
TTGAAAGGGA ACACCTGGCA TTCTGTGGTG TTTOGTGCTG TCTTAAATAA TGGTGCATTT 9720 
ATTATGTTCA AGTTATTTCA GGATTGCCAT ATGTGCAAAC AAATCATGCA ATGCAGCCAA 9780 
GOAATATATQ TTGTTGTTGT TGTTTTAAAC CCATTTTTTT TTTAGAATTT TCATTAATAC 9840
TGTAGITATA CACCATATGC CTCATTTTAT CATAGCCTAT TGTGTATGAA AGATGTTTGT 9900
ACAATGAATT GATGTTTAGT TTGCTTTAGT CATTTAAAAA GATATTGTAC CAGGATGTGC 9960 
TATTAAGAGC ACGTATCCAT TATTCTTCTC AACCCAAGAA CCTGTTTCCT GGACCAGTGA 10020 
CCAAACCTCA TATGTGAAAT GGCCAAAGCA CATGCAGGCT CCTGGTTGTT CCTCTCAAAC 10080 
CTGTGCTGAC CAAAGATTAG TAACCAGTTA TACCCAGTAT TTTGAGGTTT TATTGTTTTT 10140 
TTAATAACTA AAAAAAAACT CGTGCC 



Seq ID NO: 459 Protein sequence
Protein Accession #: NP_001990.1 

1 11 21 31 41 51 

I I I I I I 

MGRRRRLCLQ LYFLWLGCVV LWAQGTAGQP QPPPPKPPRP QPPPQQVRSA TAGSEGGFLA 60

PEYREEGAAV ASRVRRRGQQ DVLRGPNVCG SRFHSYCCPG WKTLPGGNQC IVPICRNSCG 120 

DGFCSRPNMC TCSSGQISST CGSKSIQQCS VRCMNGGTCA DDHCQCQKGY IGTYCGQPVC 180 

ENGCQNGGRC IAQPCACVYG FTGPQCERDY RTGPCFTQVN NQMCQGQLTG IVCTKTLCCA 240

TTGRAWGHPC EMCPAQPQPC RRGFIPNIRT GACQDVDECQ AIPGICQGGN CINTVGSFEC 300 

RCPAGHKQSE TTQKCEDIDE CSIIPGICET GECSNTVGSY FCVCPRGYVT STDGSRCIDQ 360

RTGMCFSGLV NGRCAQELPG RMTKMQCCCE PGRCWGIGTI PEACPVRGSE EYRRLCMDGL 420 

PMGGIPGSAG SRPGGTGGNG FAPSGNGNGY GPGGTGFIPI PGGNGFSPGV GGAGVGAGGQ 480 

GPIITGLTIL NQTIDICKHH ANLCLNGRCI PTVSSYRCEC NMGYKQDANG DCIDVDECTS 540 

NPCTNGDCVN TPGSYYCKCH AGFQRTPTKQ ACIDIDECIQ NGVLCKNGRC VNSDGSFQCI 600

CNAGFELTTD GKNCVDHDEC TTTNMCLNGM CINEDGSFKC ICKPGFVLAP NGRYCTDVDE 660

CQTPGICMNG HCINSEGSFR CDCPPGLAVG MDGRVCVDTH MRSTCYGGIK KGVCVRPFPG 720 

AVTKSECCCA NPDYGFGEPC QPCPAKNSAE FHGLCSSGVG ITVDGRDINE CALDPDICAN 780 

GICENLRGSY RCNCNSGYEP DASGRNCIDI DECLVNRLLC DNGLCRNTPG SYSCTCPPGY 840 

VFRTETETCE DINECESNPC VNGACRNNLG SFNCECSPGS KLSSTGLICI DSLKGTCWLN 900

IQDSRCEVNI NGATLKSECC ATLGAAWGSP CERCELDTAC PRGLARIKGV TCEDVNECEV 960 

FPGVCPKGRC VNSKGSFHCE CPEGLTLDGT GRVCLDIRME QCYLKWDEDE CIHPVPGKFR 1020 

MDACCCAVGA AWGTECEECP KPGTKEYETL CPRGAGFANR GDVLTGRPFY KDINECKAFP 1080

GMCTYGKCRN TIGSFKCRCN SGFALDMEER NCTDIDECRI SPDLCGSGIC VNTPGSFECE 1140

CFEGYESGFM MMKNCMDIDG CERNPLLCRG GTCVNTEGSF QCDCPLGHEL SPSREDCVDI 1200 

NECSLSDNLC RNGKCVNMIG TYQCSCNPGY QATPDRQGCT DIDECMIMNG GCDTQCTNSE 1260

GSYECSCSEG YALMPDGRSC ADIDECENNP DICDGGQCTN IPGEYRCLCY DGFMASMDMK 1320 

TCIDVNECDL NSNICMFGEC ENTKGSFICH CQLGYSVKKG TTGCTDVDEC EIGAHNCDMH 1380 

ASCLNIPGSF KCSCREGWIG NGIKCIDLDE CSNGTHQCSI NAQCVNTPGS YRCACSEGFT 1440

GDGFTCSDVD ECAENINLCE NGQCLNVPGA YRCECEMGFT PASDSRSCQD IDECSFQNIC 1500 

VSGTCNNLPG MFHCICDDGY ELDRTGGNCT DIDECADPIN CVNGLCVNTP GRYECNCPPD 1560

FQLNPTGVGC VDNRVGNCYL KFGPRGDGSL SCNTEIGVGV SRSSCCCSLG KAWGNPCETC 1620 

PPVNSTEYYT LCPGGEGFRP NPITIILEDI DECQELPGLC QGGNCINTFG SFQCECPQGY 1680 

YLSEDTRICE DIDECFAHPG VCGPGTCYNT LGNYTCICPP EYMQVNGGHN CMDMRKSFCY 1740 

RSYNGTTCEN ELPFNVTKRM CCCTYNVGKA GNKPCEPCPT PGTADFKTIC GNIPGFTFDI 1800 

HTGKAVDIDE CKEIPGICAN GVCINQIGSF RCECPTGFSY NDLLLVCEDI DECSNGDNLC 1860

QRNADCINSP GSYRCECAAG FKLSPNGACV DRNECLEIPN VCSHGLCVDL QGSYQCICHN 1920

GFKASQDQTM CMDVDECERH PCGNGTCKNT VGSYNCLCYP GFELTHNNDC LDIDECSSFF 1980

GQVCRNGRCF NEIGSFKCLC NEGYELTPDG KNCIDTNECV ALPGSCSPGT CQNLEGSFRC 2040 

ICPPGYEVKS ENCIDINECD EDPNICLFGS CTNTPGGFQC LCPPGFVLSD NGRRCFDTRQ 2100 

SFCFTNFENG KCSVPKAFNT TKAKCCCSKM PGEGWGDPCE LCPKDDEVAF QDLCPYGHGT 2160 

VPSLHDTRED VNECLESPGI CSNGQCINTD GSFRCECPMG YNLDYTGVRC VDTDECSIGN 2220

PCGNGTCTNV IGSFECNCNE GFEPGPMMNC EDINECAQNP LLCALRCMNT FGSYECTCPI 2280 

GYALREDQKM CKDLDECAEG LHDCESRGMM CKNLIGTFMC ICPPGMARRP DGEGCVDENE 2340

CRTKPGICEN GRCVNIIGSY RCECNEGFQS SSSGTECLDN RQGLCFAEVL QTICQMASSS 2400 

RNLVTKSECC CDGGRGWGHQ CELCPLPGTA QYKKICPHGP GYTTDGRDID ECKVMPNLCT 2460 

NGQCINTKGS FRCFCKVGYT TDISGTSCID LDECSQSPKP CNYICKNTEG SYQCSCPRGY 2520

VLQEDGKTCK DLDECQTKQH NCQFLCVNTL GGFTCKCPPG FTQHHTACID NNECGSQPLL 2580 

CGGKGICQNT PGSFSCECQR GFSLDATGLN CEDVDECDGN HRCQHGCQNI LGGYRCGCPQ 2640 

GYIQHYQWNQ CVDENECSNP NACGSASCYN TLGSYKCACP SGFSFDQFSS ACHDVNECSS 2700

SKNPCNYGCS NTEGGYLCGC PPGYYRVGQG HCVSGMGFNK GQYLSLDTEV DEENALSPEA 2760 

CYECKINGYP KKDSRQKRSI HEPDPTAVEQ ISLESVDMDS PVNMKFNLSH LGSKEHILEL 2820 

RPAIQPLNNH IRYVISQGND DSVFRIHQRN GLSYLHTAKK KLMPGTYTLE ITSIPLYKKK 2880
ELKKLEESNE DDYLLGELGE ALRMRLQIQL Y 

Seq ID NO: 460 DNA sequence

Nucleic Acid Accession #: NM_013372.1

Coding sequence: 63..617

1 11 21 31 41 51 

I I I I I I 

GCGGCCGCAC TCAGCGCCAC GCGTCGAAAG CGCAGGCCCC GAGGACCCGC CGCACTGACA 60 

GTATGAGCCG CACAGCCTAC ACGGTGGGAG CCCTGCTTCT CCTCTTGGGG ACCCTGCTGC 120 

CGGCTGCTGA AGGGAAAAAG AAAGGGTCCC AAGGTGCCAT CCCCCCGCCA GACAAGGCCC 180 

AGCACAATGA CTCAGAGCAG ACTCAGTCGC CCCAGCAGCC TGGCTCCAGG AACCGGGGGC 240 

GGGGCCAAGG GCGGGGCACT GCCATGCCCG GGGAGGAGGT GCTGGAGTCC AGCCAAGAGG 300 

CCCTGCATGT GACGGAGCGC AAATACCTGA AGCGAGACTG GTGCAAAACC CAGCCGCTTA 360 

AGCAGACCAT CCACGAGGAA GGCTGCAACA GTCGCACCAT CATCAACCGC TTCTGTTACG 420 

GCCAGTGCAA CTCTTTCTAC ATCCCCAGGC ACATCCGGAA GGAGGAAGGT TCCTTTCAGT 480 

CCTGCTCCTT CTGCAAGCCC AAGAAATTCA CTACCATGAT GGTCACACTC AACTGCCCTG 540 

AACTACAGCC ACCTACCAAG AAGAAGAGAG TCACACGTGT GAAGCAGTGT CGTTGCATAT 600 

CCATCGATTT GGATTAAGCC AAATCCAGGT GCACCCAGCA TGTCCTAGGA ATGCAGCCCC 660 

AGGAAGTCCC AGACCTAAAA CAACCAGATT CTTACTTGGC TTAAACCTAG AGGCCAGAAG 720 

AACCCCCAGC TGCCTCCTGG CAGGAGCCTG CTTGTGCGTA GTTCGTGTGC ATGAGTGTGG 780 

ATGGGTGCCT GTGGGTGTTT TTAGACACCA GAGAAAACAC AGTCTCTGCT AGAGAGCACT 840 

CCCTATTTTG TAAACATATC TGCTTTAATG GGGATGTACC AGAAACCCAC CTCACCCCGG 900 
CTCACATCTA AAGGGGOGGG GCCGTGGTCT GGTTCTGACT TTGTGTTTTT GTGCCCTCCT 960

GGGGACCAGA ATCTCCTTTC GGAATGAATG TTCATGGAAG AGGCTCCTCT GAGGGCAAGA 1020 

GACCTGTTTT AGTGCTGCAT TCGACATGGA AAAGTCCTTT TAACCTGTGC TTGCATCCTC 1080 

CTTTCCTCCT CCTCCTCACA ATCCATCTCT TCTTAAGTTG ATAGTGACTA TGTCAGTCTA 1140 

ATCTCTTGTT TGCCAAGGTT CCTAAATTAA TTCACTTAAC CATGATGCAA ATGTTTTTCA 1200 

TTTTGTGAAG ACCCTCCAGA CTCTGGGAGA GGCTGGTGTG GGCAAGGACA AGCAGGATAG 1260 

TGGAGTGAGA AAGGGAGGGT GGAGGGTGAG GCCAAATCAG GTCCAGCAAA AGTCAGTAGG 1320

GACATTGCAG AAGCTTGAAA GGCCAATACC AGAACACAGG CTGATGCTTC TGAGAAAGTC 1380 

TTTTCCTAGT ATTTAACAGA ACCCAAGTGA ACAGAGGAGA AATGAGATTG CCAGAAAGTG 1440 

ATTAACTTTG GCCGTTGCAA TCTGCTCAAA CCTAACACCA AACTGAAAAC ATAAATACTG 1500 

ACCACTCCTA TGTTCGGACC CAAGCAAGTT AGCTAAACCA AACCAACTCC TCTGCTTTGT 1560 

CCCTCAGGTG GAAAAGAGAG GTAGTTTAGA ACTCTCTGCA TAGGGGTGGG AATTAATCAA 1620 

AAACCKCAGA GGCTGAAATT CCTAATACCT TTCCTTTATC GTGGTTATAG TCAGCTCATT 1680 

TCCATTCCAC TATTTCCCAT AATGCTTCTG AGAGCCACTA ACTTGATTGA TAAAGATCCT 1740 

GCCTCTGCTG AGTGTACCTG ACAGTAAGTC TAAAGATGAR AGAGTTTAGG GACTACTCTG 1800 

TTTTAGCAAG ARATATTKTG GGGGTCTTTT TGTTTTAACT ATTGTCAGGA GATTGGGCTA 1860 

RAGAGAAGAC GACGAGAGTA AGGAAATAAA GGGRATTGCC TCTGGCTAGA GAGTAAGTTA 1920 

GGTGTTAATA CCTGGTAGAA ATGTAAGGGA TATGACCTCC CTTTCTTTAT GTGCTCACTG 1980 

AGGATCTGAG GGGACCCTGT TAGGAGAGCA TAGCATCATG ATGTATTAGC TGTTCATCTG 2040 

CTACTGGTTG GATGGACATA ACTATTGTAA CTATTCAGTA TTTACTGGTA GGCACTGTCC 2100 

TCTGATTAAA CTTGGCCTAC TGGCAATGGC TACTTAGGAT TGATCTAAGG GCCAAAGTGC 2160 

AGGGTGGGTG AACTTTATTG TACTTTGGAT TTGGTTAACC TGTTTTCTTC AAGCCTGAGG 2220 

TTTTATATAC AAACTCCCTG AATACTCTTT TTGCCTTGTA TCTTCTCAGC CTCCTAGCCA 2280 

AGTCCTATGT AATATGGAAA ACAAACACTG CAGACTTGAG ATTCAGTTGC CGATCAAGGC 2340 

TCTGGCATTC AGAGAACCCT TGCAACTCGA GAAGCTGTTT TTATTTCGTT TTTGTTTTGA 2400 

TCCAGTGCTC TCCCATCTAA CAACTAAACA GGAGCCATTT CAAGGCGGGA GATATTTTAA 2460 

ACACCCAAAA TGTTGGGTCT GATTTTCAAA CTTTTAAACT CACTACTGAT GATTCTCACG 2520 

CTAGGCGAAT TTGTCCAAAC ACATAGTGTG TGTGTTTTGT ATACACTGTA TGACCCCACC 2580 

CCAAATCTTT GTATTGTCCA CATTCTCCAA CAATAAAGCA CAGAGTGGAT TTAATTAAGC 2640 

ACACAAATGC TAAGGCAGAA TTTTGAGGGT GGGAGAGAAG AAAAGGGAAA GAAGCTGAAA 2700 

ATGTAAAACC ACACCAGGGA GGAAAAATGA CATTCAGAAC CAGCAAACAC TGAATTTCTC 2760 

TTGTTGTTTT AACTCTGCCA CAAGAATGCA ATTTCGTTAA TGGAGATGAC TTAAGTTGGC 2820

AGCAGTAATC TTCTTTTAGG AGCTTGTACC ACAGTCTTGC ACATAAGTGC AGATTTGGCT 2880 

CAAGTAAAGA GAATTTCCTC AACACTAACT TCACTGGGAT AATCAGCAGC GTAACTACCC 2940 

TAAAAGCATA TCACTAGCCA AAGAGGGAAA TATCTGTTCT TCTTACTGTG CCTATATTAA 3000 

GACTAGTACA AATGTGGTGT GTCTTCCAAC TTTCATTGAA AATGCCATAT CTATACCATA 3060 

TTTTATTCGA GTCACTGATG ATGTAATGAT ATATTTTTTC ATTATTATAG TAGAATATTT 3120 

TTATGGCAAG ATATTTGTGG TCTTGATCAT ACCTATTAAA ATAATGCCAA ACACCAAATA 3180 

TGAATTTTAT GATGTACACT TTGTGCTTGG CATTAAAAGA AAAAAACACA CATCCTGGAA 3240 

GTCTGTAAGT TGTTTTTTGT TACTGTAGGT CTTCAAAGTT AAGAGTGTAA GTGAAAAATC 3300 

TGGAGGAGAG GATAATTTCC ACTGTGTGGA ATGTGAATAG TTAAATGAAA AGTTATGGTT 3360 

ATTTAATGTA ATTATTACTT CAAATCCTTT GGTCACTGTG ATTTCAAGCA TGTTTTCTTT 3420 

TTCTCCTTTA TATGACTTTC TCTGAGTTGG GCAAAGAAGA AGCTGACACA CCGTATGTTG 3480 

TTAGAGTCTT TTATCTGGTC AGGGGAAACA AAATCTTGAC CCAGCTGAAC ATGTCTTCCT 3540 

GAGTCAGTGC CTGAATCTTT ATTTTTTAAA TTGAATGTTC CTTAAAGGTT AACATTTCTA 3600 

AAGCAATATT AAGAAAGACT TTAAATGTTA TTTTGGAAGA CTTACGATGC ATGTATACAA 3660 

ACGAATAGCA GATAATGATG ACTAGTTCAC ACATAAAGTC CTTTTAAGGA GAAAATCTAA 3720 

AATGAAAAGT GGATAAACAG AACATTTATA AGTGATCAGT TAATGCCTAA GAGTGAAAGT 3780 

AGTTCTATTG ACATTCCTCA AGATATTTAA TATCAACTGC ATTATGTATT ATGTCTGCTT 3840 

AAATCATTTA AAAACGGCAA AGAATTATAT AGACTATGAG GTACCTTGCT GTGTAGGAGG 3900 

ATGAAAGGGG AGTTGATAGT CTCATAAAAC TAATTTGGCT TCAAGTTTCA TGAATCTGTA 3960 

ACTAGAATTT AATTTTCACC CCAATAATGT TCTATATAGC CTTTGCTAAA GAGCAACTAA 4020 
TAAATTAAAC CTATTCTTTC AAAAAAAAA 

Seq ID NO: 461 Protein sequence 
Protein Accession #: NP_037504.1 

1 11 21 31 41 51 

I I I I I I 

MSRTAYTVGA LLLLLGTLLP AAEGKKKGSQ GAIPPPDKAQ HNDSEQTQSP QQPGSRNRGR 60 

GQGRGTAMPG EEVLESSQEA LHVTERKYLK RDWCKTQPLK QTIHEEGCNS RTIINRFCYG 120 

QCNSFYIPRH IRKEEGSFQS CSFCKPKKFT TMMVTLNCPE LQPPTKKKRV TRVKQCRCIS 180
IDLD 

Seq ID NO: 462 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..2733 

1 11 21 31 41 51 

I I I I I I 

ATGAAAGTTG GAGTGCTGTG GCTCATTTCT TTCTTCACCT TCACTGACGG CCACGGTGGC 60 

TTCCTGGGGA AAAATGATGG CATCAAAACA AAAAAAGAAC TCATTGTGAA TAAGAAAAAA 120 

CATCTAGGCC CAGTCGAAGA ATATCAGCTG CTGCTTCAGG TGACCTATAG AGATTCCAAG 180 

GAGAAAAGAG ATTTGAGAAA TTTTCTGAAG CTCTTGAAGC CTCCATTATT ATGGTCACAT 240 

GGGCTAATTA GAATTATCAG AGCAAAGGCT ACCACAGACT GCAACAGCCT GAATGGAGTC 300

CTGCAGTGTA CCTGTGAAGA CAGCTACACC TGGTTTCCTC CCTCATGCCT TGATCCCCAG 360 

AACTGCTACC TTCACACGGC TGGAGCACTC CCAAGCTGTG AATGTCATCT CAACAACCTC 420 

AGCCAGAGTG TCAATTTCTG TGAGAGAACA AAGATTTGGG GCACTTTCAA AATTAATGAA 480 

AGGTTTACAA ATGACCTTTT GAATTCATCT TCTGCTATAT ACTCCAAATA TGCAAATGGA 540 

ATTGAAATTC AACTTAAAAA AGCATATGAA AGAATTCAAG GTTTTGAGTC GGTTCAGGTC 600 

ACCCAATTTC GAAATGGAAG CATCGTTGCT GGGTATGAAG TTGTTGGCTC CAGCAGTGCA 660 

TCTGAACTGC TGTCAGCCAT TGAACATGTT GCCGAGAAGG CTAAGACAGC CCTTCACAAG 720 

CTGTTTCCAT TAGAAGACGG CTCTTTCAGA GTGTTCGGAA AAGCCCAGTG TAATGACATT 780 

GTCTTTGGAT TTGGGTCCAA GGATGATGAA TATACCCTGC CCTGCAGCAG TGGCTACAGG 840 

GGAAACATCA CAGCCAAGTG TGAGTCCTCT GGGTGGCAGG TCATCAGGGA GACTTGTGTG 900

CTCTCTCTGC TTGAAGAACT GAACAAGAAT TTCAGTATGA TTGTAGGCAA TGCCACTGAG 960 

GCAGCTGTGT CATCCTTCGT GCAAAATCTT TCTGTCATCA TTCGGCAAAA CCCATCAACC 1020 
ACAGTGGGGA ATCTGGCTTC GGTGGTGTCG ATTCTGAGCA ATATTTCATC TCTGTCACTG 1080

GCCAGCCATT TCAGGGTGTC CAATTCAACA ATGGAGGATG TCATCAGTAT AGCTGACAAT 1140 

ATCCTTAATT CAGCCTCAGT AACCAACTGG ACAGTCTTAC TGCGGGAAGA AAAGTATGCC 1200 

AGCTCACGGT TACTAGAGAC ATTAGAAAAC ATCAGCACTC TGGTGCCTCC GACAGCTCTT 1260 

CCTCTGAATT TTTCTCGGAA ATTCATTGAC TGGAAAGGGA TTCCAGTGAA CAAAAGCCAA 1320 

CTCAAAAGGG GTTACAGCTA TCAGATTAAA ATGTGTCCCC AAAATACATC TATTCCCATC 1380 

AGAGGCCGTG TGTTAATTGG GTCAGACCAA TTCCAGAGAT CCCTTCCAGA AACTATTATC 1440 

AGCATGGCCT CGTTGACTCT GGGGAACATT CTACCCGTTT CCAAAAATGG AAATGCTCAG 1500 

GTCAATGGAC CTGTGATATC CACGGTTATT CAAAACTATT CCATAAATGA AGTTTTCCTA 1560 

TTTTTTTCCA AGATAGAGTC AAACCTGAGC CAGCCTCATT GTGTGTTTTG GGATTTCAGT 1620 

CATTTGCAGT GGAACGATGC AGGCTGCCAC CTAGTGAATG AAACTCAAGA CATCGTGACG 1680 

TGCCAATGTA CTCACTTGAC CTCCTTCTCC ATATTGATGT CACCTTTTGT CCCCTCTACA 1740 

ATCTTCCCCG TTGTAAAATG GATCACCTAT GTGGGACTGG GTATCTCCAT TGGAAGTCTC 1800 

ATTTTATGCC TGATCATCGA GGCTTTGTTT TGGAAGCAGA TTAAAAAAAG CCAAACCTCT 1860 

CACACACGTC GTATTTGCAT GGTGAACATA GCCCTGTCCC TCTTGATTGC TGATGTCTGG 1920 

TTTATTGTTG GTGCCACAGT GGACACCACG GTGAACCCTT CTGGAGTCTG CACAGCTGCT 1980 

GTGTTCTTTA CACACTTCTT CTACCTCTCT TTGTTCTTCT GGATGCTCAT GCTTGGCATC 2040 

CTGCTGGCTT ACCGGATCAT CCTCGTGTTC CATCACATGG CCCAGCATTT GATGATGGCT 2100 

GTTGGATTTT GCCTGGGTTA TGGGTGCCCT CTCATTATAT CTGTCATTAC CATTGCTGTC 2160 

ACGCAACCTA GCAATACCTA CAAAAGGAAA GATGTGTGTT GGCTTAACTG GTCCAATGGA 2220 

AGCAAACCAC TCCTGGCTTT TGTTGTCCCT GCACTGGCTA TTGTGGCTGT GAACTTCGTT 2280 

GTGGTGCTGC TAGTTCTCAC AAAGCTCTGG AGGCCGACTG TTGGGGAAAG ACTGAGTCGG 2340 

GATGACAAGG CCACCATCAT CCGCGTGGGG AAGAGCCTCC TCATTCTGAC CCCTCTGCTA 2400 

GGGCTCACCT GGGGCTTTGG AATAGGAACA ATAGTGGACA GCCAGAATCT GGCTTGGCAT 2460 

GTTATTTTTG CTTTACTCAA TGCATTCCAG GGATTTTTTA TCTTATGCTT TGGAATACTC 2520 

TTGGACAGTA AGCTGCGACA ACTTCTGTTC AACAAGTTGT CTGCCTTAAG TTCTTGGAAG 2580 

CAAACAGAAA AGCAAAACTC ATCAGATTTA TCTGCCAAAC CCAAATTCTC AAAGCCTTTC 2640 

AACCCACTGC AAAACAAAGG CCATTATGCA TTTTCTCATA CTGGAGATTC CTCCGACAAC 2700 
ATCATGCTAA CTCAGTTTGT CTCAAATGAA TAA 

Seq ID NO: 463 Protein sequence 
Protein Accession #: Eos sequence

1 11 21 31 41 51 

I I I I I I 

MKVGVLWLIS FFTFTDGHGG FLGKNDGIKT KKELIVNKKK HLGPVEEYQL LLQVTYRDSK 60 

EKRDLRNFLK LLKPPLLWSH GLIRIIRAKA TTDCNSLNGV LQCTCEDSYT WFPPSCLDPQ 120 

NCYLHTAGAL PSCECHLNNL SQSVNFCERT KIWGTFKINE RFTNDLLNSS SAIYSKYANG 180 

IEIQLKKAYE RIQGFESVQV TQFRNGSIVA GYEVVGSSSA SELLSAIEHV AEKAKTALHK 240

LFPLEDGSFR VFGKAQCNDI VFGFGSKDDE YTLPCSSGYR GNITAKCESS GWQVIRETCV 300 

LSLLEELNKN FSMIVGNATE AAVSSFVQNL SVIIRQNPST TVGNLASVVS ILSNISSLSL 360

ASHFRVSNST MEDVISIADN ILNSASVTNW TVLLREEKYA SSRLLETLEN ISTLVPPTAL 420 

PLNFSRKFID WKGIPVNKSQ LKRGYSYQIK MCPQNTSIPI RGRVLIGSDQ FQRSLPETII 480

SMASLTLGNI LPVSKNGNAQ VNGPVISTVI QNYSINEVFL FFSKIESNLS QPHCVFWDFS 540

HLQWNDAGCH LVNETQDIVT CQCTHLTSFS ILMSPFVPST IFPVVKWITY VGLGISIGSL 600

ILCLIIEALF WKQIKKSQTS HTRRICMVNI ALSLLIADVW FIVGATVDTT VNPSGVCTAA 660 

VFFTHFFYLS LFFWMLMLGI LLAYRIILVF HHMAQHLMMA VGFCLGYGCP LIISVITIAV 720

TQPSNTYKRK DVCWLNWSNG SKPLLAFVVP ALAIVAVNFV VVLLVLTKLW RPTVGERLSR 780

DDKATIIRVG KSLLILTPLL GLTWGFGIGT IVDSQNLAWH VIFALLNAFQ GFFILCFGIL 840

LDSKLRQLLF NKLSALSSWK QTEKQNSSDL SAKPKFSKPF NPLQNKGHYA FSHTGDSSDN 900 
IMLTQFVSNE 

Seq ID NO: 464 DNA sequence

Nucleic Acid Accession #: AB035089.1

Coding sequence: 9845..10219

1 11 21 31 41 51 

I I I I I I

GGGCATGCAG CCATCGGGGA AAATCCATAG TGCAGATAAA GCAAGGAGGA AGAAGAAGGA 60 

CAGTTCTAGT AAAAGGGAGA ACATCAATAT AGGATGTTTC TTAGCAATAG AAAAAGAAGG 120 

CCAAGAGGAA TTAGGGAGAG AGTTATAAGA GATCAGCAAG GGGACAGGGT TAGATTTGGT 180 

TTGGTTTGAA AGCATACAGT AAATATGATG TCTGTCCCTG GCAGTGTTGG CAGAGTAGGA 240 

AGGAGGAAGG GAGGCAAGAG ATAATATCAT TTTCTCTGTG CTCCAACTGT ACTTACATAT 300 

GAGACTATTT CCCTCTCTGC TTTTCAAACC TTACTGGAGT TGTTTTCCCT CATGAAAACC 360 

AAGAAAGGAA AGCTAGTTAG TCTTGTTCTG AGGTTGTTCA ATGTATACAT ATCTATATCT 420 

GTAGACAGAA TCCTTGGGAA TACAGTAATT GACATATATT CTGTTATTTG ATGCTTGAAA 480 

AATCTCCTCC ACTAACCAGT TTCCCTATAG ATTGCCACAA GCACATAATA AGAAACAATA 540 

AATAAAATGT TCTCTTGACT TTGTTACTTA ACAATGCTGA GAAAACTTTA CAGCCTTCAT 600 

AAGGAAGTGA GGTCCAGGAA AATCTAGGAG ATATTTCTTA ACCAATCTAT AAAGGCATTA 660 

GTAATGACAG GATATTTCCT GAAAGTGTAA TTTCCCATTG AGGATTTGTT TTTAATTTCT 720 

GGATTCCTGG AGCCAATGAA GTTGGTGTAT GTTTATGAAA TATCAAGAGA CATAAGTTGG 780 

CAAGTGTTCA TATGCAAAAA CTTCTTGGAA TTTCTGAGTT CTCTGTGGCA ATATATGACA 840 

TCAGGATATG TCCAGTCTCA CACACCAGGA TATGTCCTTT CTAGCCTGTC TATCACATGC 900 

TAGGAGAACT ATTTAGGAAC AGAAAAAAAT GCCTGAAATG ATTTCTCATT TGAACTCATC 960 

CAAGCTTTCT CTAAATTTAA GCAAACTCCT GGTCATTTTC AGTTAGTACC TTTCCTTAAG 1020 

TTCAACCTTC AGGGCAAACC TCCGTGCCTC AGACGTTTAG CCATAGTCTG AAATTCTCTT 1080 

CCATAGATTG GTCCCCTGTA ACCCCGGTTT GTCTCAGCTT GTTATCCTGT TTTTTTCTTC 1140

CCTCCATTCC CAGGATGAGC TTGTTGCTTC TGTCCTATGA GACATTAGAT TCCTTTTCTT 1200 

TGGTACCCGA GTAAATCCAT CCTACTCCAA TAGAGGAAGG TCCATTTTTG TCTTATAGCG 1260 

CTGGATGCAG ACTCAGCTGA GAAGACCATT ATTCATTTTT GGAATTCTTT ATCTCAGATA 1320 

TTTCCTCTTC TTTCTTTTTC TTCTATCTTT GGATTTTTAG TCCATCAACG CCCCATTAGT 1380 

CTATTCCCCG ACTTCAATCA GGGAACTTAT ACCTCTTAAA CTCATTCAGA GACTCAAAAC 1440 

ATATATATTG ATACAGGAGA CCTAAGAAGA GCATGTCTTG GGGGTTGAGG AAACAGGCAG 1500 

GTGAGAAATT TCCAGATTGG AAACACAGCT TCCTTTCTCC CATCCAGCCC CTACTTTCAG 1560 

CCTATGTGTT TCTGGCACCT TGTTGTAGAT AAATCTCCCT TGACTTTGTG ATGTGCTGAG 1620 

AAAACAAACT CACGGCTGGT GTTAAAAAGG GCCCATGACA ATACCAAGTG TTGGGGAGAA 1680 

TGTGGAGAAA TCAGAACTCT ATTCAOGGTC GGTTGGAATG CACACTTGTG CAGAATTCTA 1740 
TGGAGAAGAG TCTGGCATTT CCTCAAAATG TTAACCTGGA TTTACCATAT GACCCAGCGA 1800

TTTCATTCAT AGGTTTATAC TCAAAAGAAA TGAAGAAATA TGCCATGCAA AAAAATGTAC 1860

ATGAAAGGTC ACAACATCAT TATTCATAAT AGTAAAAGGA TGGAAACAAC ACAAATGTCC 1920 

ATCAACTTAT GATTAAAGAA AATCTGGTCT ATTCATAGAA TGGAATATTA TTCGACCACA 1980 

AAAAGGAATG ATGTACTGAT CCATGCAATG ATGTGGACAA ACCATGAAAA TAACACTAGA 2040 

TTAAAGAAGC CAGTCACAAA AGGACTTACT GTATGATTCC ATTTACCTGA AATGTTTGGA 2100

ATAGGCAAAT CCATAGAAAC AGGAGGTAGA TTCCTGGTTT CCAGGGTCTC CAGGAAGGGA 2160 

AGAATGAAGT ACAAGATTTC TTTTGGAGGT AGTGAAATTG TTGTGGAATG AGATCATGAT 2220 

GATGATAGCA CAACTTTGTG AATATAATAA AATCATTGAA TTGTACAGTT GAATTTATGG 2280 

TATATAAATT ATATGTTAAT AAAAAGGGGG TCCACAAAAC AAACAGCCCC CCACTCTGGT 2340 

TGTCAGGGAG ATATTGGATT AAATGGCCTT GGACAACAAC CCCTCTCCCT GGCCACAGAC 2400 

ATTCTTCAGA TTACAAGATA TTCCAGGGGA AACACTGGAA TGAGTCTGAA GCCAGGTGCT 2460 

AAACAGAAGG ACCATTGAGA AATGTTGTGA TCCTGACAGG TCAAGCAATT TATTTTTCGG 2520 

CTTCATTTTT AAATGTAAAA TTAGAAAGCT GCCATTTAAA ATGGCCCGTC TGTTTCAATT 2580 

GCTCTTCTCA GTGTCAGCCT GTTAACTCAA TGTGTTAGTC TGTTTTCATG CTGCTGATAA 2640 

AAACATACCT GAGACTGGCA AGAAAAAGAG GTTTAATTGG GCTTAGAGTT CCACGTGATT 2700 

GGGGAGGCCT CAGAATCACA GTAGGAGGCA AAAGTTATTC TTACATGGTG GCTGCAAGAG 2760 

AAGATGAGGA AGAAGCAAAA GAAGAAACCC CTGATAAACC CATCGGATCT CCTGAGGCTT 2820 

ATTAACTATC ATGAGAATAG CACAAGAAAG ACCGGCCCCC ATGATTCAAT TACCTCTACC 2880 

TGGGTCCCTC CAATAACATG TGGAAATTCT GGTAGATACA ATTCAAGTTG AGATTTGGGT 2940 

GGGAACACAG CCAAACCATA TCACTCAGCA AGGCAGATAA CTTTCTCACT GAGCCTATGC 3000 

AACAGAAAAC CATCTGGGAT GGTTGTAAGG GGCACAGGAA GTGACTGGTA GGATCACTGC 3060 

CAAAGCTGAG CACTCAGGAG AAGGCAATAG AATCCTATTC TCCATAGTAT GCTATAAGAT 3120 

ACTGAAGTAC ACTTCTTCAC TATCTCTTTG GACTTAGAAT TAGCACTACA TTCCTTGTTA 3180 

TACAGAAAAA TTACTAAGGA AATTCATAGG ATGACAAAAA CTTTCAGAAC TGAAAAACAG 3240 

GAAATGTAAG CTTTTTAGTT CTTTGGTATT CGAAGTATGC CTAAAAGACA ATGCAAAATC 3300 

CAAGAAAAGA ATGGTGGGGT TTTTGTTTGT TTGGTTTTGT TTTTGTTTTA CAGCTGGAGT 3360 

AGAATACAAA GGGATGGAGT TGAAACAAAT GAGAGGAAAT TGGAATTCTA AACTTATTCT 3420

CATTGGCATT AGAAAGGCAC CTACATGTAT TTCACATGAG CCGGTGACTG CTGACTTGCA 3480 

TTCTTATTTT TTCCCTATAG ATTAAAAAGG AGGTACAATG GTAGAACTGT AATCCTGTCC 3540 

TTTGTCATAA ATTTTCATAT TCATAAAGGT GAGTGTTAGC CCGCTTGTGA AATCTGAAGT 3600 

TGAGTAACTT CAAATACTAA CCACAGAGGG AAAGGCAGCA AGAGGAGAGG CATAAATTTA 3660 

GGATCTCACC CTTCATTCCA CAGACACACA CAGCCTCTCT GCCCACCTCT GCTTCCTCTA 3720 

GGAACACAGG TAAGAGCTTC AAGCCTCTCC AGCTTAATAA CATGAATTAT TTTTGAGAAT 3780 

AATAATGATA CTGTGTTCTA TATCATGCAT CTCCTGCATT CTGTCTGATT ATATTTTACT 3840 

TATTCTGCCA GAGCAAAATT AAAATACCTA TTTCATCTGA TTTGTCCTTT ATCTAAATTG 3900 

CTTAGTTCCA AGTAAACCAA GGCACTTTTA GGAACACAGA GGGAGAGTGC CTTGCAGCCA 3960 

GAGAGTCTTG AAGGAGATGT CAGGGACGCA TCTTAACAGC TGGTTGGATG TGATCCACAG 4020 

AGGTCTCCTG TTAGCATTCA TTGTAAAGCC ATCCTACCTA GCTCTAGTGT AACCAGCAAT 4080 

GAAAGAAAGA TAAAGAGGGT CGATTACTTA TTTACAATAG TCTTTAAAAA CGTAGTTTTG 4140 

TAAGCCTTCT AATTAGGACA TTAATATATT TAATATATGC ACATTGTAGA AAGATTGAAG 4200 

CGTTAAAAAT AAGAGAAAAA CTTTAAATGT CAAAATCTCA CAACCCAGAT ATATCATTTC 4260 

TTTAAGAAAA TTGTACTACA AAATACCATT CCATTTATTA AAGTCATTCT GACAGGAATC 4320 

TGATGCTTTT CCAGGAGTTC CAGATCACAT CGAGTTCACC ATGAATTCAC TCAGTGAAGC 4380 

CAACACCAAG TTCATGTTCG ATCTGTTCCA ACAGTTCAGA AAATCAAAAG AGAACAACAT 4440 

CTTCTATTCC CCTATCAGCA TCACATCAGC ATTAGGGATG GTCCTCTTAG GAGCCAAAGA 4500 

CAACACTGCA CAACAAATTA GCAAGGTAGC TATCAGCATC ATTACGTTGT CCTGTTGCAG 4560 

TTTTTCTCTG GTTCCGTCGG CTAGCACGCA GATGGTAATA GATGTGGTGG TCTGATGGGT 4620 

AGCACAGGGG GCTGTGCAGG AATTCCCATA ACTGTGAGAC CACTGACTTA AACAGATCTT 4680 

TTGAGTAAAG TTTTCTTGTC CCGCTTCATG TCTCTTCCAG GTTCTTCACT TTGATCAAGT 4740 

CACAGAGAAC ACCACAGAAA AAGCTGCAAC ATATCATGTG AGTCACAGAG CACTCTGATT 4800 

CAGCTTTAGA TCCCTGAACA GGTCATAGTT TAAACCTGGA ACTTCACAAA AACTAAGAAA 4860 

AGGCCAGTTT TAGGGAAAAT CTTGGACACA AAGATTGAGA CATACAGAGT GGGTTGGCAT 4920 

TTCATGGCAC ATAATTATTA TTCCTCATTT CTGCGTTACT AAAAGACAGT CAGCACTGTA 4980 

CCTCAGAGCA TAGGTCTGGA TCAGGATAGG CTGGGTTCAG ACTCCAGCTT TGCTCTTCAC 5040 

AAATGATGAA TAAGAGCAGG ACACAACTGC TCGGAGTCCC AGTGACCTCA TCCCAGAAAA 5100 

CTAAGGGTAA GAAAAAATCT GACTCAATAC ATGCAAATAC ATGCAAATGT TTACAACAGT 5160 

GCCTTGCCCA TAAAAGTCAT AATAAATGTT ATTATTATTA TAAAGTAGCT ATAATTATAC 5220 

TAATCATAAT AATGTGAAAA TAATTTAATT TTCATTGAGT CATTAATGAG ATTCAGAGGA 5280 

ATAAGCACAA GTCCAAGTAT ATTTTGGAAA ATGATTGCTA TGGAATATAT TGGTTTAGAG 5340 

CCTTAATAGT GCAAAATGCT TTGCTGGAAG GTAGAAAGTT CTAGATTTAA ACAGGCTTAG 5400 

GTTCAAAACT TGGCACTTCT AATTTATGTC TCTATAAACA GGGTTTTTTT CCCCATTCTC 5460 

TGAGCTTTCT TGTGTTCATC TGAATTGAAC TAAAGACTTA GAGTTACCCA TGTAAAGTCC 5520 

TTAGCCATGG ACCTGGCATA CACTCTTCTT ACGTGCAGAG AATGACCATC ATGAGGAAAG 5580 

AGCCACAGAT CAGTCAATGT GTCCTACAAG ATAATAGCAC CAACAGGTAT AACAGGGCTT 5640 

CCTGGCATAA TCTATTTAAA ATATCCAACC TTCAACATAC TCGTATCCTT GATGACTGTT 5700 

AGAAGTGAAA TATGGTCCTT GCCCATAAGG AGCTGAGAGT TTAACTGGGA AGCTAAACCT 5760 

AACCCTTTAA ACCAACAAGG AGAAAATCTA CTGGTAGACA GCGCTGCATC TTTAGTTCAG 5820 

AAGAGAAAAG ATTGCAGTAC GTTAGAGCAA GAAGAATTTT CTGGAAGAAG TCAAATATAA 5880 

GGTGGATTTT GAAGGGTATT TGAGGTGAAA TACACCAATT ATCAGGGAAT AACATCAAAG 5940 

GTCCTCAATG AGACTACCAG CATTTAGGGA CTGATCTAAC AGACTTAGCA TGGGTTTAGT 6000 

ATTTACATTG ATACAGCAAT TGAATGATCT CCTTTTTTGA TGTTTGAAGG TTGATAGGTC 6060 

AGGAAATGTT CATCACCAGT TTCAAAAGCT TCTGACTGAA TTCAACAAAT CCACTGATGC 6120 

ATATGAGCTG AAGATCGCCA ACAAGCTCTT CGGAGAAAAG ACGTATCAAT TTTTACAGGT 6180 

AATTTCACCT GGCCTACCCA CATTTCATTT GCATCCTGAT GTCTGTGTCT CTGAGTGGCC 6240 

AAATGGAAGA AAGCAAGGCA GATGAGCCTG GCCGACCCAG GTGGAGAGCA TTTACTCAGA 6300 

GTGCATTAGC TCCATTTCCA CAACTCTCCC CCACTGGAGT GTCCCAGACC CCAACGATAC 6360 

ATCACTGAAG TGTGGATTTA GGGATAATCT TGTGATAAAA GAGGAGGTTG TGTAATAGAG 6420 

TGAGTAAGAG TAATAAGTAA TAAGATACCA TCGATAAACT GGCACTGACT CAGTCACATA 6480 

CGATACATCT TGGTGGGAAA TGTATGACTA ATGGGATATT ATTGGAATGG GCAGGCTTGG 6540 

GTGAGTTCCT GAGAATAGTT GAGGAAGTAC CAGGAAATAT TGAATGCACA GGATGAAAGA 6600 

CAAAAACAAA GATCAGAAAC ATCATGGTTA AAATTACTGG AGAGAAGTCT GAGAAGCAAT 6660 

GAATCTCCTT CAGGGAAGCC TGCTCTGCAG TTTGCAAACC ACAGCCTCTT CTGCTTCTGC 6720 

CTTTTGCCAA GATGATATTG ACCTTCAGTG ACCTCTTTCT TGTGCCAGCC CACATTCCCC 6780 

TTTTGCATTG CCTACATGAC ACCTGTATAA AAATATCCAT GGACAGGAGA TACTGCATCT 6840 

ATTCAGGGTC TGGATTCAGC TTACTGTTGT TACAAATAAG TAAGTTTGGT AATATATAGT 6900 

TACATAAATT ACTCCTAATT CCTACTTCTT CCTTCATATC TCAAAGGAAT ATTTAGATGC 6960 




CATCAAGAAA TTTTACCAGA CCAGTGTGGA ATCTACTGAT TTTGCAAATG CTCCAGAAGA 7020 
AAGTCGAAAG AAGATTAACT CCTGGGTGGA AAGTCAAACG AATGGTAGGA GAGCCACCCA 7080 
TTATAGAAAC ACCTTTGAGA AACCTATGCC AGTGAGCCTT GTGCTTGACA CTGCATGGGG 7140 
GAACAGGTGT GGGGATTGAG ATGGGTTTGC AGGGAGGGCT GAAGAGGGCA CTCCAGATGA 7200 
AGGATTTGTC CAAATGAATA TGAAGAGAGC CTAGGGGAGC CAAGGAGGAA ATCACAGGAA 7260 
GCCAATTAGA TGGAAACACA TCTGGAGAAT TATTTGCTTA TGGCCCTGCA TGACAATAGC 7320 
TTTGTGGATC CCCTGTCTCC GCTCAGACCT ATTTTGAGAT CATATCCTTT ACTTTAAATC 7380 
AGACTCAAAT TTTTATGATG AATATTTAAT AGAAAACATT AGAAAGCGTC TCTCGTCTCC 7440 
TTTACTAATT GGGAAACAAG CAGCTCTCTG GTAAATCACC CTTTTGTCTC TGAGCTGGAG 7500 
CTGCCTGGAT CACATCTGTA GCCAATGTGT TCTGCAGGGA TTATCACAGC TCTCTTCCCC 7560 
ATCAAGGGCA AAGAGCTTGA CAAAGTCTCC ATTCTACAGA CATCTTTCTT ACCTCCCACC 7620 
TCTCATTACA GGCCAAACTT ACAGCAACTC AACATGAGAG TGAATAGGAA GATACCCCCG 7680 
GAAGTAGTGT CTGACAGCAC AGGACATGCG TTTCATATTA CAGAGCTCAA GTCACTCATC 7740 
CTAAAATGCA ATCAGGGCCT CCTTCCTCTG AATGGGGACC CCGTAGTTAA AAAAAAATAA 7800 
AAGTAGGAAG AGGAGGGAGG GAGAAAGGAA AGACACATGT TGGAAGAGTA GACAAAATCA 7860 
GTTTATCAGT ATTCCAAATC AGATGATTGG AGACATTCAT ACACAGAGAA CGTGAACTCC 7920 
TTCTCTATCA CAAGAAGTGA TGTCTCCATC AAGGGTAACT TTATACGACT GGAGCCTTGA 7980 
AGAAAGCTGC ATCTGGTGAA CCACTGGTCA GTGAGTCTAA CAATTCAAAG ATCAAAGTCA 8040 
GTGAGTCTCA AGCAGGGATT TGGGTCAATA ATTAACGATC AGTCACGAAC ATTTGCAAAG 8100 
CATCTTCCAG ACAAGCCATT TGTAGCTTGT GTAAAAGACT CTTTTATTCT TTCCCTTGCA 8160 
GAAAAAATTA AAAACCTATT TCCTGATGGG ACTATTGGCA ATGATACGAC ACTGGTTCTT 8220 
GTGAACGCAA TCTATTTCAA AGGGCAGTGG GAGAATAAAT TTAAAAAAGA AAACACTAAA 8280 
GAGGAAAAAT TTTGGCCAAA CAAGGTATTG TCTATATTTT ATTTATATAG TGTAATATGT 8340 
TAATACATGG AATGTTAAAC ATTTCTGATG GAATGTAACA TGATAAGTAA AAAATAAAAA 8400 
TTGTTCATGT CTGTTATTTT GTTGTTTTAC TCTTATAACT TTATTTAGTT AGGAATACCT 8460 
GAAAAACTAT TGTTTCTAAC TCATGGAATT CCTGGGTTAT TTCTTAGAAG AAGAAGGATG 8520 
TGTTGCTATC TCAATAATAT TATCTTTTTT GTCTTGTGTT TCACGTGTTA TTTGTTGGAC 8580 
ACATTGATTT ATTGCAGAAT ACATACAAAT CTGTACAGAT GATGAGGCAA TACAATTCCT 8640 
TTAATTTTGC CTTGCTGGAG GATGTACAGG CCAAGGTCCT GGAAATACCA TACAAAGGCA 8700 
AAGATCTAAG CATGATTGTG CTGCTGCCAA ATGAAATCGA TGGTCTGCAG AAGGTAAGAA 8760 
CTTGCATCTA CAACTCTTCC TTCTACTGCC GGACATTTTT CCAAAGATAC CAAGTTTAAA 8820 
CAAGGTAAAA GCTTATGACC GAGTTGCCTC AAAATGATGA AAAATTCTAA ATGAGGAATG 8880 
ATGACTCACC TTCATATTAC AAATATTTGA GCATAGGGCC TGACACAAAC TGAAAGCTTA 8940 
GTTTTTGTTT GTTTGTTTGT TTTTATTATT ATTATTATAA TACTTTAAGC TTTAGGGTAC 9000 
ATGTGCACAA TGTGCAGGTT AGTTACATAT GTATACATGT GCCATGCTGG TGTGCTGCAC 9060 
CCATTAACTC ATCATTTAGC GTTAGGTATA TCTCCTAATG CTATCCCTCC CCCCTCCCCC 9120 
CACCCCACAA CAGTCCTCAG AGTGTGATGT TACCTTCCTG TGTCCAAGTG TTCTCATTGT 9180 
TCAATTCCCA TCTATGATTT AATTCCATCT ATGGCTTAGT TAATGATTAA TTTATTAGAG 9240 
TTACATGCAT TGGATATCAA TTTGATGATA TTATTATGCA GCAATTTAAA CTTGACTGGG 9300 
AGAAATATAT ACCAATGTGA GGAAAGTTTA CAAATAGGCC GAGTAGAAAA GGGAATACAA 9360 
ATTTAGGAAT TTAGGGAATT ACAATTTAAT AATTGCAATG TGTACTAAAT AATGTATACA 9420 
GAAAAATATG ATGAGCCTAT TAAAAATTGA CACATGTAGT AGGCTGTTGG CACAAGAAAT 9480 
AGTGATACAT ACAGTTCATT GTGTACAAAA TAATGTAATC ATATTTTACA TGTGTATCAT 9540 
ACAGTTGTAT ACATACATAT GTACACATAT ACATATACGT AAAAACATGA TTCTGTTTTT 9600 
ACATACATGT ATATACATAT ACACATATAA CCCAATGTAT TTATATATTC AGGACTCATA 9660 
TTTTACCTAT TAGAATAATA ATGTCTATTA AAGTGAACCT TCTGTATTTC ACATTTATTG 9720 
CCAAAATAAC GAATCTCCAC ATAGTCAATT CATTGTTAAG GTGTATTAGA GATCGACAGT 9780 
TAGTCATATC AGTTTCTTTT TTCCATTTGT ATAGCTTGAA GAGAAACTCA CTGCTGAGAA 9840 
ATTGATGGAA TGGACAAGTT TGCAGAATAT GAGAGAGACA TGTGTCGATT TACACTTACC 9900 
TCGGTTCAAA ATGGAAGAGA GCTATGACCT CAAGGACACG TTGAGAACCA TGGGAATGGT 9960 
GAATATCTTC AATGGGGATG CAGACCTCTC AGGCATGACC TGGAGCCACG GTCTCTCAGT 10020 
ATCTAAAGTC CTACACAAGG CCTTTGTGGA GGTCACTGAG GAGGGAGTGG AAGCTGCAGC 10080 
TGCCACCGCT GTAGTAGTAG TCGAATTATC ATCTCCTTCA ACTAATGAAG AGTTCTGTTG 10140 
TAATCACCCT TTCCTATTCT TCATAAGGCA AAATAAGACC AACAGCATCC TCTTCTATGG 10200 
CAGATTCTCA TCCCCATAGA TGCAATTAGT CTGTCACTCC ATTTAGAAAA TGTTCACCTA 10260 
GAGGTGTTCT GGTAAACTGA TTGCTGGCAA CAACAGATTC TCTTGGCTCA TATTTCTTTT 10320 
CTATCTCATC TTGATGATGA TAGTCATCAT CAAGAATTTA ATGATTAAAA TAGCATGCCT 10380 
TTCTCTCTTT CTCTTAATAA GCCCACATAT AAATGTACTT TTCCTTCCAG AAAAATTTCC 10440 
CTTGAGGAAA AATGTCCAAG ATAAGATGAA TCATTTAATA CCGTGTCTTC TAAATTTGAA 10500 
ATATAATTCT GTTTCTGACC TGTTTTAAAT GAACCAAACC AAATCATACT TTCTCTTCAA 10560 
ATTTAGCAAC CTAGAAACAC ACATTTCTTT GAATTTAGGT GATACCTAAA TCCTTCTTAT 10620 
GTTTCTAAAT TTTGTGATTC TATAAAACAC ATCATCAATA AAATAATGAC ATAAAATCAT 10680 
TTTTGCTTTA CCTGTTTTCT CTCTGGAAAG GGCAAGTGTC CAGTTACACA TAGGAAAGAT 10740 
AATTTAGAGA TATATTAATC ATATATAAAG GAAAATTAAA AACAGAGTAG TTCATGATGA 10800 
GCCTGGAGTA GAAGGCATAT CCCAGAACAG GAGGAGCCTT GTAAACCACA TAGGAACTTC 10860 
CTATTTTATG CTAAAGGGAT AAGAAACTCA TTACAGGCTT TGATGGTTGT TTGTCAAAGA 10920 
GGGGCATAAA ATTATCATAT CCACATCTAG AAAATACATC TCTGGCTACG CTGATATCAA 10980 
TGGATGCGAG GAAAGAACAG TGTGGTTACC ATATATAAAT TAGGAAATCA TTAGAGTATT 11040 
GGGAGTGGAA ATGGAGAGAA AGAAAGAGCC TGGGGGAATT ATTTAGGAAA TAATAGTTAC 11100 
AGAAAGACAT CTAAGTTGCT GACCTATCTG ACTGGATGGA TGGAAGAATA TCTTGTTTCT 11160 
GAGAGAAAAA AAGACTTTGG GTTTAAATTT GTACTTGATG AATTAAGGTA CTTTTAATAT 11220 
TCAAATGGAT TTGCCTGGCA GGCACTTGAA GATATTAGTC TAAATCTCAG AAACAGAATA 11280 
TGATCTGAAG CTCTAAATTT GTGATATTCA ATATAAATAC TTTAGAGTCA TTGGGATAAA 11340 
TATGGTAGTT GTAGCTAAAA GCAAAAATAA GATACTAGGG AGAAAGGATA AAGTTAGAAG 11400 
AAAGAAGAAT CTAGAATTGA CCTTGAAGTA TATCAGCATG TGTAAAGATC AGGAATTGAT 11460 
CATTTTTATT TTCCAGAAAG TAGCTTTTCT TAGGGTTCCA TATTTACTCC CATAGATTCT 11520 
TCCC 

Seq ID NO: 465 Protein sequence 
Protein Accession #: BAB21525.1 

1 11 21 31 41 51 

I I I I I I 

MNSLSEANTK FMFDLFQQFR KSKENNIFYS PISITSALGM VLLGAKDNTA QQISKVLHFD 60 

QVTENTTEKA ATYHVDRSGN VHHQFQKLLT EFNKSTDAYE LKIANKLFGE KTYQFLQEYL 120 

DAIKKFYQTS VESTDFANAP EESRKKINSW VESQTNEKIK NLFPDGTIGN DTTLVLVNAI 180 

YFKGQWENKF KKENTKEEKF WPNKNTYKSV QMMRQYNSFN FALLEDVQAK VLEIPYKGKD 240 

LSMIVLLPNE IDGLQKLEEK LTAEKLMEWT SLQNMRETCV DLHLPRFKME ESYDLKDTLR 300 

TMGMVNIFNG DADLSGMTWS HGLSVSKVLH KAFVEVTEEG VEAAAATAVV VVELSSPSTN 360 
EEFCCNHPFL FFIRQNKTNS ILFYGRFSSP 

Seq ID NO: 466 DNA sequence 

Nucleic Acid Accession #: NM_001910.1 

Coding sequence: 50..1240 

1 11 21 31 41 51 

I I I I I I 

GGAGAGAAGA AAGGAGGGGG CAAGGGAGAA GCTGCTGGTC GGACTCACAA TGAAAACGCT 60 

CCTTCTTTTG CTGCTGGTGC TCCTGGAGCT GGGAGAGGCC CAAGGATCCC TTCACAGGGT 120 

GCCCCTCAGG AGGCATCCGT CCCTCAAGAA GAAGCTGCGG GCACGGAGCC AGCTCTCTGA 180 

GTTCTGGAAA TCCCATAATT TGGACATGAT CCAGTTCACC GAGTCCTGCT CAATGGACCA 240 

GAGTGCCAAG GAACCCCTCA TCAACTACTT GGATATGGAA TACTTCGGCA CTATCTCCAT 300 

TGGCTCCCCA CCACAGAACT TCACTGTCAT CTTCGACACT GGCTCCTCCA ACCTCTGGGT 360 

CCCCTCTGTG TACTGCACTA GCCCAGCCTG CAAGACGCAC AGCAGGTTCC AGCCTTCCCA 420 

GTCCAGCACA TACAGCCAGC CAGGTCAATC TTTCTCCATT CAGTATGGAA CCGGGAGCTT 480 

GTCCGGGATC ATTGGAGCCG ACCAAGTCTC TGTGGAAGGA CTAACCGTGG TTGGCCAGCA 540 

GTTTGGAGAA AGTGTCACAG AGCCAGGCCA GACCTTTGTG GATGCAGAGT TTGATGGAAT 600 

TCTGGGCCTG GGATACCCCT CCTTGGCTGT GGGAGGAGTG ACTCCAGTAT TTGACAACAT 660 

GATGGCTCAG AACCTGGTGG ACTTGCCGAT GTTTTCTGTC TACATGAGCA GTAACCCAGA 720 

AGGTGGTGCG GGGAGCGAGC TGATTTTTGG AGGCTACGAC CACTCCCATT TCTCTGGGAG 780 

CCTGAATTGG GTCCCAGTCA CCAAGCAAGC TTACTGGCAG ATTGCACTGG ATAACATCCA 840 

GGTGGGAGGC ACTGTTATGT TCTGCTCCGA GGGCTGCCAG GCCATTGTGG ACACAGGGAC 900 

TTCCCTCATC ACTGGCCCTT CCGACAAGAT TAAGCAGCTG CAAAACGCCA TTGGGGCAGC 960 

CCCCGTGGAT GGAGAATATG CTGTGGAGTG TGCCAACCTT AACGTCATGC CGGATGTCAC 1020 

CTTCACCATT AACGGAGTCC CCTATACCCT CAGCCCAACT GCCTACACCC TACTGGACTT 1080 

CGTGGATGGA ATGCAGTTCT GCAGCAGTGG CTTTCAAGGA CTTGACATCC ACCCTCCAGC 1140 

TGGGCCCCTC TGGATCCTGG GGGATGTCTT CATTCGACAG TTTTACTCAG TCTTTGACCG 1200 

TGGGAATAAC CGTGTGGGAC TGGCCCCAGC AGTCCCCTAA GGAGGGGCCT TGTGTCTGTG 1260 

CCTGCCTGTC TGACAGACCT TGAATATGTT AGGCTGGGGC ATTCTTTACA CCTACAAAAA 1320 

GTTATTTTCC AGAGAATGTA GCTGTTTCCA GGGTTGCAAC TTGAATTAAG ACCAAACAGA 1380 

ACATGAGAAT ACACACACAC ACACACATAT ACACACACAC ACACTTCACA CATACACACC 1440 

ACTCCCACCA CCGTCATGAT GGAGGAATTA CGTTATACAT TCATATTTTG TATTGATTTT 1500 

TGATTATGAA AATCAAAAAT TTTCACATTT GATTATGAAA ATCTCCAAAC ATATGCACAA 1560 

GCAGAGATCA TGGTATAATA AATCCCTTTG CAACTCCACT CAGCCCTGAC AACCCATCCA 1620 

CACACGGCCA GGCCTGTTTA TCTACACTGC TGCCCACTCC TCTCTCCAGC TCCACATGCT 1680 

GTACCTGGAT CATTCTGAAG CAAATTCCGA GCATTACATC ATTTTGTCCA TAAATATTTC 1740 

TAACATCCTT AAATATACAA TCGGAATTCA AGCATCTCCC ATTGTCCCAC AAATGTTTGG 1800 

CTGTTTTTGT AGTTGGATTG TTTGTATTAG GATTCAAGCA AGGCCCATAT ATTGCATTTA 1860 

TTTGAAATGT CTGTAAGTCT CTTTCCATCT ACAGAGTTTA GCACATTTGA ACGTTGCTGG 1920 

TTGAAATCCC GAGGTGTCAT TTGACATGGT TCTCTGAACT TATCTTTCCT ATAAAATGGT 1980 

AGTTAGATCT GGAGGTCTGA TTTTGTGGCA AAAATACTTC CTAGGTGGTG CTGGGTACTT 2040 

CTTGTTGCAT CCTGTCAGGA GGCAGATAAT GCTGGTGCCT CTCTATTGGT AATGTTAAGA 2100 
CTGCTGGGTG GGTTTGGAGT TCTTGGCTTT AATCATTCAT TACAAAGTTC AGCATTTT 

Seq ID NO: 467 Protein sequence 
Protein Accession #: NP_001901.1 

1 11 21 31 41 51 

I I I I I I 

MKTLLLLLLV LLELGEAQGS LHRVPLRRHP SLKKKLRARS QLSEFWKSHN LDMIQFTESC 60 

SMDQSAKEPL INYLDMEYFG TISIGSPPQN FTVIFDTGSS NLWVPSVYCT SPACKTHSRF 120 

QPSQSSTYSQ PGQSFSIQYG TGSLSGIIGA DQVSVEGLTV VGQQFGESVT EPGQTFVDAE 180 

FDGILGLGYP SLAVGGVTPV FDNMMAQNLV DLPMFSVYMS SNPEGGAGSE LIFGGYDHSH 240 

FSGSLNWVPV TKQAYWQIAL DNIQVGGTVM FCSEGCQAIV DTGTSLITGP SDKIKQLQNA 300 

IGAAPVDGEY AVECANLNVM PDVTFTINGV PYTLSPTAYT LLDFVDGMQF CSSGFQGLDI 360 
HPPAGPLWIL GDVFIRQFYS VFDRGNNRVG LAPAVP 

Seq ID NO: 468 DNA sequence 

Nucleic Acid Accession #: NM_018058.1 

Coding sequence: 319..1575 

1 11 21 31 41 51 

I I I I I I 

TACGCGCTGC GGGACCGGCA GGGGAACGCC ATCGGGGTCA CAGCCTGCGA CATCGACGGG 60 

GACGGCCGGG AGGAGATCTA CTTCCTCAAC ACCAATAATG CCTTCTCGGG GGTGGCCACG 120 

TACACCGACA AGTTGTTCAA GTTCCGCAAT AACCGGTGGG AAGACATCCT GAGCGATGAG 180 

GTCAACGTGG CCCGTGGTGT GGCCAGCCTC TTTGCCGGAC GCTCTGTGGC CTGTGTGGAC 240 

AGAAAGGGCT CTGGACGCTA CTCTATCTAC ATTGCCAATT ACGCCTACGG TAATGTGGGC 300 

CCTGATGCCC TCATTGAAAT GGACCCTGAG GCCAGTGACC TCTCCCGGGG CATTCTGGCG 360 

CTCAGAGATG TGGCTGCTGA GGCTGGGGTC AGCAAATATA CAGGGGGCCG AGGCGTCAGC 420 

GTGGGCCCCA TCCTCAGCAG CAGTGCCTCG GATATCTTCT GCGACAATGA GAATGGGCCT 480 

AACTTCCTTT TCCACAACCG GGGCGATGGC ACCTTTGTGG ACGCTGCGGC CAGTGCTGGT 540 

GTGGACGACC CCCACCAGCA TGGGCGAGGT GTCGCCCTGG CTGACTTCAA CCGTGATGGC 600 

AAAGTGGACA TCGTCTATGG CAACTGGAAT GGCCCCCACC GCCTCTATCT GCAAATGAGC 660 

ACCCATGGGA AGGTCCGCTT CCGGGACATC GCCTCACCCA AGTTCTCCAT GCCCTCCCCT 720 

GTCCGCACGG TCATCACCGC CGACTTTGAC AATGACCAGG AGCTGGAGAT CTTCTTCAAC 780 

AACATTGCCT ACCGCAGCTC CTCAGCCAAC CGCCTCTTCC GCGTCATCCG TAGAGAGCAC 840 

GGAGACCCCC TCATCGAGGA GCTCAATCCC GGCGACGCCT TGGAGCCTGA GGGCCGGGGC 900 

ACAGGGGGTG TGGTGACCGA CTTCGACGGA GACGGGATGC TGGACCTCAT CTTGTCCCAT 960 

GGAGAGTCCA TGGCTCAGCC GCTGTCCGTC TTCCGGGGCA ATCAGGGCTT CAACAACAAC 1020 

TGGCTGCGAG TGGTGCCACG CACCCGGGTT GGGGCCTTTG CCAGGGGAGC TAAGGTCGTG 1080 

CTCTACACCA AGAAGAGTGG GGCCCACCTG AGGATCATCG ACGGGGGCTC AGGCTACCTG 1140 

TGTGAGATGG AGCCCGTGGC ACACTTTGGC CTGGGGAAGG ATGAAGCCAG CAGTGTGGAG 1200 

GTGACGTGGC CAGATGGCAA GATGGTGAGC CGGAACGTGG CCAGCGGGGA GATGAACTCA 1260 
GTGCTGGAGA TCCTCTACCC CCGGGATGAG GACACACTTC AGGACCCAGC CCCACTGGAG 1320 
ACACCAATGA ATGCATCCAG TTCCCATTCG TGTGCCCTCG AGACAAGCCC GTATGTGTCA 1380 
ACACCTATGG AAGCTACAGG TGCCGGACCA ACAAGAAGTG CAGTCGGGGC TACGAGCCCA 1440 
ACGAGGATGG CACAGCCTGC GTGGGGACTC TCGGCCAGTC ACCGGGCCCC CGCCCCACCA 1500 
CCCCCACCGC TGCTGCTGCC ACTGCCGCTG CTGCTGCCGC TGCTGGAGCT GCCACTGCTG 1560 
CACCGGTCCT CGTAGATGGA GATCTCAATC TGGGGTCGGT GGTTAAGGAG AGCTGCGAGC 1620 
CCAGCTGCTG AGCAGGGGTG GGACATGAAC CAGCGGATGG AGTCCAGCAG GGGAGTGGGA 1680 
AAGTGGGCTT GTGCTGCTGC CTAGACAGTA GGGATGTAAA GGCCTGGGAG CTAGACCCTC 1740 
CCCAAGCCCA TCCATGCACA TTACTTAGCT AACAATTAGG GAGACTCGTA AGGCCAGGCC 1800 
CTGTGCTGGG CACATAGCTG TGATCACAGC AGACAGGGTC GCTGCCCTGA TGGCGCTTAC 1860 
ATTCCAGTGG GTCTAATGAC CATATCTTAG GACACAGATG TGCCCAGGGA GGTGGTGTCA 1920 
CTGCACAGGA AGTATGAGGA CTTTAGTGTC CTGAGTTCAA ATCCTGATTC AGGAACTCAC 1980 
AAAGCTATGT GACCTTACAC CAGTCACTTA ACTTGTTAGC CATCCATTAT CGCATCTGCA 2040 
AAATGGGGAT TAAGAATAGA ATCTTGGGGT TAGTGTGGAG ATTAGATTAA ATGTATGTAA 2100 
GACACTTGGC ACAAAACCTG GCACATAGTA AAGGCTCAAT AAAAACAAGT GCCTCTCACT 2160 
GGGCTTTGTC AACACGTG 

Seq ID NO: 469 Protein sequence 
Protein Accession #: NP_060528.1 

1 11 21 31 41 51 
I I I I I I 
MDPEASDLSR GILALRDVAA EAGVSKYTGG RGVSVGPILS SSASDIFCDN ENGPNFLFHN 60 
RGDGTFVDAA ASAGVDDPHQ HGRGVALADF NRDGKVDIVY GNWNGPHRLY LQMSTHGKVR 120 
FRDIASPKFS MPSPVRTVIT ADFDNDQELE IFFNNIAYRS SSANRLFRVI RREHGDPLIE 180 
ELNPGDALEP EGRGTGGVVT DFDGDGMLDL ILSHGESMAQ PLSVFRGNQG FNNNWLRVVP 240 
RTRVGAFARG AKVVLYTKKS GAHLRIIDGG SGYLCEMEPV AHFGLGKDEA SSVEVTWPDG 300 
KMVSRNVASG EMNSVLEILY PRDEDTLQDP APLETPMMAS SSHSCALETS PYVSTPMEAT 360 
GAGPTRSAVG ATSPTRMAQP AHGLSASHRA PAPPPPPLLL PLPLLLPLLE LPLLHRSS 

Seq ID NO: 470 DNA sequence 
Nucleic Acid Accession #: AJ279016 
Coding sequence: 1..1962 

1 11 21 31 41 51 
I I I I I I 
ATGTCCAGGA TGTTACCGTT CCTGCTGCTG CTCTGGTTTC TGCCCATCAC TGAGGGGTCC 60 
CAGCGGGCTG AACCCATGTT CACTGCAGTC ACCAACTCAG TTCTGCCTCC TGACTATGAC 120 
AGTAATCCCA CCCAGCTCAA CTATGGTGTG GCAGTTACTG ATGTGGACCA TGATGGGGAC 180 
TTTGAGATCG TCGTGGCGGG GTACAATGGA CCCAACCTGG TTCTGAAGTA TGACCGGGCC 240 
CAGAAGCGGC TGGTGAACAT CGCGGTCGAT GAGCGCAGCT CACCCTACTA CGCGCTGCGG 300 
GACCGGCAGG GGAACGCCAT CGGGGTCACA GCCTGCGACA TCGACGGGGA CGGCCGGGAG 360 
GAGATCTACT TCCTCAACAC CAATAATGCC TTCTCGGGGG TGGCCACGTA CACCGACAAG 420 
TTGTTCAAGT TCCGCAATAA CCGGTGGGAA GACATCCTGA GCGATGAGGT CAACGTGGCC 480 
CGTGGTGTGG CCAGCCTCTT TGCCGGACGC TCTGTGGCCT GTGTGGACAG AAAGGGCTCT 540 
GGACGCTACT CTATCTACAT TGCCAATTAC GCCTACGGTA ATGTGGGCCC TGATGCCCTC 600 
ATTGAAATGG ACCCTGAGGC CAGTGACCTC TCCCGGGGCA TTCTGGCGCT CAGAGATGTG 660 
GCTGCTGAGG CTGGGGTCAG CAAATATACA GGGGGCCGAG GCGTCAGCGT GGGCCCCATC 720 
CTCAGCAGCA GTGCCTCGGA TATCTTCTGC GACAATGAGA ATGGGCCTAA CTTCCTTTTC 780 
CACAACCGGG GCGATGGCAC CTTTGTGGAC GCTGCGGCCA GTGCTGGTGT GGACGACCCC 840 
CACCAGCATG GGCGAGGTGT CGCCCTGGCT GACTTCAACC GTGATGGCAA AGTGGACATC 900 
GTCTATGGCA ACTGGAATGG CCCCCACCGC CTCTATCTGC AAATGAGCAC CCATGGGAAG 960 
GTCCGCTTCC GGGACATCGC CTCACCCAAG TTCTCCATGC CCTCCCCTGT CCGCACGGTC 1020 
ATCACCGCCG ACTTTGACAA TGACCAGGAG CTGGAGATCT TCTTCAACAA CATTGCCTAC 1080 
CGCAGCTCCT CAGCCAACCG CCTCTTCCGC GTCATCCGTA GAGAGCACGG AGACCCCCTC 1140 
ATCGAGGAGC TCAATCCCGG CGACGCCTTG GAGCCTGAGG GCCGGGGCAC AGGGGGTGTG 1200 
GTGACCGACT TCGACGGAGA CGGGATGCTG GACCTCATCT TGTCCCATGG AGAGTCCATG 1260 
GCTCAGCCGC TGTCCGTCTT CCGGGGCAAT CAGGGCTTCA ACAACAACTG GCTGCGAGTG 1320 
GTGCCACGCA CCCGGTTTGG GGCCTTTGCC AGGGGAGCTA AGGTCGTGCT CTACACCAAG 1380 
AAGAGTGGGG CCCACCTGAG GATCATCGAC GGGGGCTCAG GCTACCTGTG TGAGATGGAG 1440 
CCCGTGGCAC ACTTTGGCCT GGGGAAGGAT GAAGCCAGCA GTGTGGAGGT GACGTGGCCA 1500 
GATGGCAAGA TGGTGAGCCG GAACGTGGCC AGCGGGGAGA TGAACTCAGT GCTGGAGATC 1560 
CTCTACCCCC GGGATGAGGA CACACTTCAG GACCCAGCCC CACTGGAGTG TGGCCAAGGA 1620 
TTCTCCCAGC AGGAAAATGG CCATTGCATG GACACCAATG AATGCATCCA GTTCCCATTC 1680 
GTGTGCCCTC GAGACAAGCC CGTATGTGTC AACACCTATG GAAGCTACAG GTGCCGGACC 1740 
AACAAGAAGT GCAGTCGGGG CTACGAGCCC AACGAGGATG GCACAGCCTG CGTGGGGACT 1800 
CTCGGCCAGT CACCGGGCCC CCGCCCCACC ACCCCCACCG CTGCTGCTGC CACTGCCGCT 1860 
GCTGCTGCCG CTGCTGGAGC TGCCACTGCT GCACCGGTCC TCGTAGATGG AGATCTCAAT 1920 
CTGGGGTCGG TGGTTAAGGA GAGCTGCGAG CCCAGCTGCT GAGCAGGGGT GGGACATGAA 1980 
CCAGCGGATG GAGTCCAGCA GGGGAGTGGG AAAGTGGGCT TGTGCTGCTG CCTAGACAGT 2040 
AGGGATGTAA AGGCCTGGGA GCTAGACCCT CCCCAAGCCC ATCCATGCAC ATTACTTAGC 2100 
TAACAATTAG GGAGACTCGT AAGGCCAGGC CCTGTGCTGG GCACATAGCT GTGATCACAG 2160 
CAGACAGGGT CGCTGCCCTG ATGGCGCTTA CATTCCAGTG GGTCTAATGA CCATATCTTA 2220 
GGACACAGAT GTGCCCAGGG AGGTGGTGTC ACTGCACAGG AAGTATGAGG ACTTTAGTGT 2280 
CCTGAGTTCA AATCCTGATT CAGGAACTCA CAAAGCTATG TGACCTTACA CCAGTCACTT 2340 
AACTTGTTAG CCATCCATTA TCGCATCTGC AAAATGGGGA TTAAGAATAG AATCTTGGGG 2400 
TTAGTGTGGA GATTAGATTA AATGTATGTA AGACACTTGG CACAAAACCT GGCACATAGT 2460 
AAAGGCTCAA TAAAAACAAG TGCCTCTCAC TGGGCTTTGT CAACACG 

Seq ID NO: 471 Protein sequence 
Protein Accession #: CAC08451 

1 11 21 31 41 51 
I I I I I I 
MSRMLPFLLL LWFLPITEGS QRAEPMFTAV TNSVLPPDYD SNPTQLNYGV AVTDVDHDGD 60 
FEIVVAGYNG PNLVLKYDRA QKRLVNIAVD ERSSPYYALR DRQGNAIGVT ACDIDGDGRE 120 
EIYFLNTNNA FSGVATYTDK LFKFRNNRWE DILSDEVNVA RGVASLFAGR SVACVDRKGS 180 
GRYSIYIANY AYGNVGPDAL IEMDPEASDL SRGILALRDV AAEAGVSKYT GGRGVSVGPI 240 
LSSSASDIFC DNENGPNFLF HNRGDGTFVD AAASAGVDDP HQHGRGVALA DFNRDGKVDI 300 
VYGNWNGPHR LYLQMSTHGK VRFRDIASPK FSMPSPVRTV ITADFDNDQE LEIFFNNIAY 360 
RSSSANRLFR VIRREHGDPL IEELNPGDAL EPEGRGTGGV VTDFDGDGML DLILSHGESM 420 
AQPLSVFRGN QGFNNNWLRV VPRTRFGAFA RGAKVVLYTK KSGAHLRIID GGSGYLCEME 480 
PVAHFGLGKD EASSVEVTWP DGKMVSRNVA SGEMNSVLEI LYPRDEDTLQ DPAPLECGQG 540 
FSQQENGHCM DTNECIQFPF VCPRDKPVCV NTYGSYRCRT NKKCSRGYEP NEDGTACVGT 600 
LGQSPGPRPT TPTAAAATAA AAAAAGAATA APVLVDGDLN LGSVVKESCE PSC 

Seq ID NO: 472 DNA sequence 
Nucleic Acid Accession #: FGENESH 
Coding sequence: 1..4794 

1 11 21 31 41 51 

I I I I I I 

ATGGCGTGTC CGGGAGGACT CCCAGCCCGT TGCTCTGGTT GGATGGGACT GGGTGGGCCC 60 

AGCGGCTCCT CCCCAGCATC CCCTCCCCAT TCCTCCTCCA GGTACAATGG ACCCAACCTG 120 

GTTCTGAAGT ATGACCGGGC CCAGAAGCGG CTGGTGAACA TCGCGGTCGA TGAGCGCAGC 180 

TCACCCTACT ACGCGCTGCG GGACCGGCAG GGGAACGCCA TCGGGGTCAC AGCCTGCGAC 240 

ATCGACGGGG ACGGCCGGGA GGAGATCTAC TTCCTCAACA CCAATAATGC CTTCTCGGGC 300 

CACAGCAGCT CAGCGCAGGT CCCTTCTGGG CTCCACAGAA ACAGGCCTGT GCTGAAGCCT 360 

CCACCTACAA CCCCTGCAGG CCTCCTGGGT CTGCCTCCAC TCAGCGGAAG GGACTTTTCC 420 

TCCTCCCTGG GTCAGGCTTC TCCGGACAGC AGGCAGGGAG AGAGGGTGCC GGTTCCCTGC 480 

TGTCGGGGTG GACTGAGACC TACCCATGAA CCAGAACCAT TTCTTCTGAG ACCCAAATCA 540 

GGGGTGGCCA CGTACACCGA CAAGTTGTTC AAGTTCCGCA ATAACCGGTG GGAAGACATC 600 

CTGAGCGATG AGGTCAACGT GGCCCGTGGT GTGGCCAGCC TCTTTGCCGG ACGCTCTGTG 660 

GCCTGTGTGG ACAGAAAGGG CTCTGGACGC TACTCTATCT ACATTGCCAA TTACGCCTAC 720 

GGTAATGTGG GCCCTGATGC CCTCATTGAA ATGGACCCTG AGGCCAGTGA CCTCTCCCGG 780 

GGCATTCTGG CGCTCAGAGA TGTGGCTGCT GAGGCTGGGG TCAGCAAATA TACAGAAGGC 840 

TTCTCCCACA CTGCCTCTCC AAGCATTGGT GAGATATCTG GCAGAACCGA GGAGCGGGAA 900 

GGAGGAGACC CAGAGGAGGC AGATGAGGAG CACAGTGGGG ATGGAAGCAC CAGCCAACTG 960 

TGCCGGCTGG GCTGGAAGGA CGGGCAGTTC AAGGAAGAAG CAGCAGCTTT GGTGGAGGAA 1020 

CAGAGGGAGG CTGGGGCAGC TGGCGTGCCC AGAGGACGTG TTCGAACAGC TCTGCAGACT 1080 

TCCAAAAGCC ATTTGGCTGA CAAGAACCTA TTTGGCCCAC CATGTTACTA TTCTGTCTGC 1140 

GCGCCTTCTC CAGCCCACCC TTTCCCTGCC CGCCAAGCCC CCCAACACTA CCCTGTAGCC 1200 

CCCCTTGTCA CTCAGCTAAT GACACATGGA CGTCTGGCTG GAAAACTAGC CCGGAGTGTC 1260 

CCCCACCCCC GAGCCCCAGG AATGGACCCC AAATGTAAGG GCCGCCATGC TGAGCCCGGC 1320 

CTGATGGCTG AGGCTTTGGG CGCGTGGCCA GCGCTCAGCA CCACTGTGGT GCCAGGGGGC 1380 

CTGAGAAGCT GGGAGGAAAG CAGGCAGAAG GGGCAGGCCA TGTCCAGATG TGCACTCAGG 1440 

GAGCTGGGAG GTCCCTGGAG CCAAGCCACA CAGCACCTGC CTGCTAGAGA GCTGTATGAC 1500 

CTGGGAGAAC CTCCCATTTT ACAAAGAACA GACGGAGATC CAGGGAGGAG AAGGGACTCG 1560 

CCCAAGGTCA CACAGGAGTG CCATCTAGTG GCCACCATGC CAGCTCTCGG GGGACTCGAG 1620 

GGCCCCGGGA GGGTGGCCAA GCGAGAGATT GGGAGAGAGA CTGGGGCAGT AGGAAGACCA 1680 

CTCTCCCATC CCCTGGTCCC CAACTTCCCC AGCTGCTTGA GGCCTCTTGA AGCCGGGACA 1740 

GTGCCGGGAG CTGCCCTGCC TGGGAATCCT GGGAACTGGG TTCTGGACAT GGCCAAGGCC 1800 

CTGGCGTGGA ACCAGATGGA AAAAGAGGAG GGGAAGATTC ATGGAGACCA TGAGCCCAGA 1860 

TTTAGGCTCA GGAAAGCACG GGAAGCAGAA TTCCCCCCAG GCTCCTCTGA GGAGCCTCTG 1920 

CTGCAGTTCC CCTCAGGCCT CAGAGGCAGC CCTGTCCTCC AGGTGGGCCT GGGGCTTGCT 1980 

TCTGCCACTC ACTGTGGGTC GATGTCTTTT CTAGGGGGCC GAGGCGTCAG CGTGGGCCCC 2040 

ATCCTCAGCA GCAGTGCCTC GGATATCTTC TGCGACAATG AGAATGGGCC TAACTTCCTT 2100 

TTCCACAACC GGGGCGATGG CACCTTTGTG GACGCTGCGG CCAGTGCTGA ACGTCGTTTA 2160 

GCCTTCATCG TTCACCTCAA ATATCACCTC TGCAGAGATT TTCCTCACTC CCTGTGCCAC 2220 

CTAGCAGAAA CTGGTCCTTC CTCCTCCTGC TGCCCGTGGC ATGCACGTCT TCTTCAGGCT 2280 

CCACATTGCC ATCATGGTTT GTCTATGAGC TTTACAAGGA CCGGGTCACG GTTCTATTCA 2340 

TTCTTGACGC AAGGCTTGGC CTCCAGTGCC CACCGGAGGA CACTCAGCCT CCAGGGTTCT 2400 

CAGGGGGCCC CACCCTGCCT TCTGGCAAGA GCTCCCTGTG TCCTGGGGTC TCTGATCCCC 2460 

ACTGCCTATT ACATTGTCCT GTGGTCTGCC ATCCCAGAGA GCCTGATGAC CCACAGCTAT 2520 

TTGTCCTCTG AAAGAGTCAA CGTGGGTGTG GACGACCCCC ACCAGCATGG GCGAGGTGTC 2580 

GCCCTGGCTG ACTTCAACCG TGATGGCAAA GTGGACATCG TCTATGGCAA CTGGAATGGC 2640 

CCCCACCGCC TCTATCTGCA AATGAGCACC CATGGGAAGG TCCGCTTCCG GGACATCGCC 2700 

TCACCCAAGT TCTCCATGCC CTCCCCTGTC CGCACGGTCA TCACCGCCGA CTTTGACAAT 2760 

GACCAGGAGC TGGAGATCTT CTTCAACAAC ATTGCCTACC GCAGCTCCTC AGCCAACCGC 2820 

CTCTTCCGAT GCTCCATCCT GGCTCGTGGC TCTTCATCCT TGACAGCTGG TGGGAGGAAC 2880 

GGTCAGGGAG AAGGTTTAAG AATCAGAAGG GGAGGGTTCC CAGGGCCAGG GGGTCAGGCC 2940 

AAGGTCAACA CAGGTCCCCT GATGAAGAAA CAGAAAGGAA GGAAGGACGA GGACTGGGCA 3000 

AGAGGCTGTG GGAATGCAGG GCAAAGCCTG GCCAAGGAGC CGGCCTCTGC TATTGCAGGG 3060 

AAAGGGAAGG GAAATGTGGC CCAAAGTGTG CCCAGAACCC AAGCGCCACA AGATACAAAG 3120 

CCACACTACC ACAAAAAGGG GCTACAGGGT CCAATCACTA CCAGGAAAAG GGGCTACGGG 3180 

GTCCAATCAC TACCAGGAAA AGGGGCTACG GGGTCCAATC ACTACCAGGA AAAGGGGCTA 3240 

CGGGGTCCAA TCACTACCAG GAAAAGGGGC TACGGGGTCC AATCACTACC AGGAAAAGGG 3300 

GCTACGGGCT CCAATCACTA CCAGGAAAAG GGGCTACAGG GTCCAATCAC TACCAGGAAA 3360 

AGGGGCTACG GGCTCCAATC ACTACCAGGA AAAGGGGCTA CAGGGTCCAA TCACTACCAC 3420 

AGAAAGGGGC TACGGGCTCC AATCACTACC AGGAAAAGGG GCTACGGGGT CCAATCACTA 3480 

CCAGGAAAAG GGGCTACAGG GTCCAATCAC TACCAGGAAA AGGGGCTACG GGGTCCAATC 3540 

ACTACCAGGA AAAGGGGCTA CGGGCTCCAA TCACTACCAG GAAAAGGGGC TACGGGGTCC 3600 

AATCACTACC AGGAAAAGGG GCTACAGGGT CCAATCACTA CCAGGAAAAG GGGCTACAGG 3660 

GTCCAATCAC TACCACAGAA AGGGGCTACG GGCTCCAATC ACTACCAGGA AAAGGGGCTA 3720 

CGGGGTCCAA TCACTACCAG GAAAAGGGGC TACGGGCTCC AATCACTACC AGGAAAAGAG 3780 

GCTATGGGGT CCAATCACTA CCAGGAAAAG GGGCTACGGG CTCCAATCAC TACCAGGAAA 3840 

AGGGGCTATG GGGTCCAATC ACTACCACAG AAAGGGGCTA CGGGGTCCAA CGTCATCCGT 3900 

AGAGAGCACG GAGACCCCCT CATCGAGGAG CTCAATCCCG GCGACGCCTT GGAGCCTGAG 3960 

GGCCGGGGCA CAGGGGGTGT GGTGACCGAC TTCGACGGAG ACGGGATGCT GGACCTCATC 4020 

TTGTCCCATG GAGAGTCCAT GGCTCAGCCG CTGTCCGTCT TCCGGGGCAA TCAGGGCTTC 4080 

AACAACAACT GGCTGCGAGT GGTGCCACGC ACCCGGTTTG GGGCCTTTGC CAGGGGAGCT 4140 

AAGGTCGTGC TCTACACCAA GAAGAGTGGG GCCCACCTGA GGATCATCGA CGGGGGCTCA 4200 

GGCTACCTGT GTGAGATGGA GCCCGTGGCA CACTTTGGCC TGGGGAAGGA TGAAGCCAGC 4260 

AGTGTGGAGG TGACGTGGCC AGATGGCAAG ATGGTGAGCC GGAACGTGGC CAGCGGGGAG 4320 
ATGAACTCAG TGCTGGAGAT CCTCTACCCC CGGGATGAGG ACACACTTCA GGACCCAGCC 4380 
CCACTGGAGT GTGGCCAAGG ATTCTCCCAG CAGGAAAATG GCCATTGCAT GGACACCAAT 4440 
GAATGCATCC AGTTCCCATT CGTGTGCCCT CGAGACAAGC CCGTATGTGT CAACACCTAT 4500 
GGAAGCTACA GGTGCCGGAC CAACAAGAAG TGCAGTCGGG GCTACGAGCC CAACGAGGAT 4560 
GGCACAGCCT GCGTGGGTAC TGAGCTAGGC TCTAGGCATA CAATGACGTG GAAACCAAGG 4620 
CCCAAAAAGG AGCTGCAACT TTCCCAAGGC ATCTGCACCC CCGTCTGGTC CTTTTTCCTG 4680 
CCGGGTTGCC GGCTGCTCCT CAAAAGAGCT CAGCTCCAGG CTGCTCCCAG CACCCTTCTC 4740 
CAGAAAGCTC CAGGTATTCC AGAAGCCCAA GTGTATGAAC AAGATCAGGA ATAA 

Seq ID NO: 473 Protein sequence 
Protein Accession #: FGENESH predicted 

1 11 21 31 41 51 
I I I I I I 
MACPGGLPAR CSGWMGLGGP SGSSPASPPH SSSRYNGPNL VLKYDRAQKR LVNIAVDERS 60 
SPYYALRDRQ GNAIGVTACD IDGDGREEIY FLNTNNAFSG HSSSAQVPSG LHRNRPVLKP 120 
PPTTPAGLLG LPPLSGRDFS SSLGQASPDS RQGERVPVPC CRGGLRPTHE PEPFLLRPKS 180 
GVATYTDKLF KFRNNRWEDI LSDEVNVARG VASLFAGRSV ACVDRKGSGR YSIYIANYAY 240 
GNVGPDALIE MDPEASDLSR GILALRDVAA EAGVSKYTEG FSHTASPSIG EISGRTEERE 300 
GGDPEEADEE HSGDGSTSQL CRLGWKDGQF KEEAAALVEE QREAGAAGVP RGRVRTALQT 360 
SKSHLADKNL FGPPCYYSVC APSPAHPFPA RQAPQHYPVA PLVTQLMTHG RLAGKLARSV 420 
PHPRAPGMDP KCKGRHAEPG LMAEALGAWP ALSTTVVPGG LRSWEESRQK GQAMSRCALR 480 
ELGGPWSQAT QHLPARELYD LGEPPILQRT DGDPGRRRDS PKVTQECHLV ATMPALGGLE 540 
GPGRVAKREI GRETGAVGRP LSHPLVPNFP SCLRPLEAGT VPGAALPGNP GNWVLDMAKA 600 
LAWNQMEKEE GKIHGDHEPR FRLRKAREAE FPPGSSEEPL LQFPSGLRGS PVLQVGLGLA 660 
SATHCGSMSF LGGRGVSVGP ILSSSASDIF CDNENGPNFL FHNRGDGTFV DAAASAERRL 720 
AFIVHLKYHL CRDFPHSLCH LAETGPSSSC CPWHARLLQA PHCHHGLSMS FTRTGSRFYS 780 
FLTQGLASSA HRRTLSLQGS QGAPPCLLAR APCVLGSLIP TAYYIVLWSA IPESLMTHSY 840 
LSSERVNVGV DDPHQHGRGV ALADFNRDGK VDIVYGNWNG PHRLYLQMST HGKVRFRDIA 900 
SPKFSMPSPV RTVITADFDN DQELEIFFNN IAYRSSSANR LFRCSILARG SSSLTAGGRN 960 
GQGEGLRIRR GGFPGPGGQA KVNTGPLMKK QKGRKDEDWA RGCGNAGQSL AKEPASAIAG 1020 
KGKGNVAQSV PRTQAPQDTK PHYHKKGLQG PITTRKRGYG VQSLPGKGAT GSNHYQEKGL 1080 
RGPITTRKRG YGVQSLPGKG ATGSNHYQEK GLQGPITTRK RGYGLQSLPG KGATGSNHYH 1140 
RKGLRAPITT RKRGYGVQSL PGKGATGSNH YQEKGLRGPI TTRKRGYGLQ SLPGKGATGS 1200 
NHYQEKGLQG PITTRKRGYR VQSLPQKGAT GSNHYQEKGL RGPITTRKRG YGLQSLPGKE 1260 
AMGSNHYQEK GLRAPITTRK RGYGVQSLPQ KGATGSNVIR REHGDPLIEE LNPGDALEPE 1320 
GRGTGGVVTD FDGDGMLDLI LSHGESMAQP LSVFRGNQGF NNNWLRVVPR TRFGAFARGA 1380 
KVVLYTKKSG AHLRIIDGGS GYLCEMEPVA HFGLGKDEAS SVEVTWPDGK MVSRNVASGE 1440 
MNSVLEILYP RDEDTLQDPA PLECGQGFSQ QENGHCMDTN ECIQFPFVCP RDKPVCVNTY 1500 
GSYRCRTNKK CSRGYEPNED GTACVGTELG SRHTMTWKPR PKKELQLSQG ICTPVWSFFL 1560 
PGCRLLLKRA QLQAAPSTLL QKAPGIPEAQ VYEQDQE 

Seq ID NO: 474 DNA sequence 
Nucleic Acid Accession #: NM_003661.1 
Coding sequence: 1..1152 

1 11 21 31 41 51 
I I I I I I 
ATGAGTGCAC TTTTCCTTGG TGTGGGAGTG AGGGCAGAGG AAGCTGGAGC GAGGGTGCAA 60 
CAAAACGTTC CAAGTGGGAC AGATACTGGA GATCCTCAAA GTAAGCCCCT CGGTGACTGG 120 
GCTGCTGGCA CCATGGACCC AGAGAGCAGT ATCTTTATTG AGGATGCCAT TAAGTATTTC 180 
AAGGAAAAAG TGAGCACACA GAATCTGCTA CTCCTGCTGA CTGATAATGA GGCCTGGAAC 240 
GGATTCGTGG CTGCTGCTGA ACTGCCCAGG AATGAGGCAG ATGAGCTCCG TAAAGCTCTG 300 
GACAACCTTG CAAGACAAAT GATCATGAAA GACAAAAACT GGCACGATAA AGGCCAGCAG 360 
TACAGAAACT GGTTTCTGAA AGAGTTTCCT CGGTTGAAAA GTGAGCTTGA GGATAACATA 420 
AGAAGGCTCC GTGCCCTTGC AGATGGGGTT CAGAAGGTCC ACAAAGGCAC CACCATCGCC 480 
AATGTGGTGT CTGGCTCTCT CAGCATTTCC TCTGGCATCC TGACCCTCGT CGGCATGGGT 540 
CTGGCACCCT TCACAGAGGG AGGCAGCCTT GTACTCTTGG AACCTGGGAT GGAGTTGGGA 600 
ATCACAGCCG CTTTGACCGG GATTACCAGC AGTACCATGG ACTACGGAAA GAAGTGGTGG 660 
ACACAAGCCC AAGCCCACGA CCTGGTCATC AAAAGCCTTG ACAAATTGAA GGAGGTGAGG 720 
GAGTTTTTGG GTGAGAACAT ATCCAACTTT CTTTCCTTAG CTGGCAATAC TTACCAACTC 780 
ACACGAGGCA TTGGGAAGGA CATCCGTGCC CTCAGACGAG CCAGAGCCAA TCTTCAGTCA 840 
GTACCGCATG CCTCAGCCTC ACGCCCCCGG GTCACTGAGC CAATCTCAGC TGAAAGCGGT 900 
GAACAGGTGG AGAGGGTTAA TGAACCCAGC ATCCTGGAAA TGAGCAGAGG AGTCAAGCTC 960 
ACGGATGTGG CCCCTGTAAG CTTCTTTCTT GTGCTGGATG TAGTCTACCT CGTGTACGAA 1020 
TCAAAGCACT TACATGAGGG GGCAAAGTCA GAGACAGCTG AGGAGCTGAA GAAGGTGGCT 1080 
CAGGAGCTGG AGGAGAAGCT AAACATTCTC AACAATAATT ATAAGATTCT GCAGGCGGAC 1140 
CAAGAACTGT GA 

Seq ID NO: 475 Protein sequence 
Protein Accession #: NP_003652.1 

1 11 21 31 41 51 
I I I I I I 
MSALFLGVGV RAEEAGARVQ QNVPSGTDTG DPQSKPLGDW AAGTMDPESS IFIEDAIKYF 60 
KEKVSTQNLL LLLTDNEAWN GFVAAAELPR NEADELRKAL DNLARQMIMK DKNWHDKGQQ 120 
YRNWFLKEFP RLKSELEDNI RRLRALADGV QKVHKGTTIA NVVSGSLSIS SGILTLVGMG 180 
LAPFTEGGSL VLLEPGMELG ITAALTGITS STMDYGKKWW TQAQAHDLVI KSLDKLKEVR 240 
EFLGENISNF LSLAGNTYQL TRGIGKDIRA LRRARANLQS VPHASASRPR VTEPISAESG 300 
EQVERVNEPS ILEMSRGVKL TDVAPVSFFL VLDVVYLVYE SKHLHEGAKS ETAEELKKVA 360 
QELEEKLNIL NNNYKILQAD QEL 



Seq ID NO: 476 DNA sequence 
Nucleic Acid Accession #: NM_014452. 
Coding sequence: 1..1968 

1 11 21 31 41 51 
I I I I I I 
ATGGGGACCT CTCCGAGCAG CAGCACCGCC CTCGCCTCCT GCAGCCGCAT CGCCCGCCGA 60 
GCCACAGCCA CGATGATCGC GGGCTCCCTT CTCCTGCTTG GATTCCTTAG CACCACCACA 120 
GCTCAGCCAG AACAGAAGGC CTCGAATCTC ATTGGCACAT ACCGCCATGT TGACCGTGCC 180 
ACCGGCCAGG TGCTAACCTG TGACAAGTGT CCAGCAGGAA CCTATGTCTC TGAGCATTGT 240 
ACCAACACAA GCCTGCGCGT CTGCAGCAGT TGCCCTGTGG GGACCTTTAC CAGGCATGAG 300 
AATGGCATAG AGAAATGCCA TGACTGTAGT CAGCCATGCC CATGGCCAAT GATTGAGAAA 360 
TTACCTTGTG CTGCCTTGAC TGACCGAGAA TGCACTTGCC CACCTGGCAT GTTCCAGTCT 420 
AACGCTACCT GTGCCCCCCA TACGGTGTGT CCTGTGGGTT GGGGTGTGCG GAAGAAAGGG 480 
ACAGAGACTG AGGATGTGCG GTGTAAGCAG TGTGCTCGGG GTACCTTCTC AGATGTGCCT 540 
TCTAGTGTGA TGAAATGCAA AGCATACACA GACTGTCTGA GTCAGAACCT GGTGGTGATC 600 
AAGCCGGGGA CCAAGGAGAC AGACAACGTC TGTGGCACAC TCCCGTCCTT CTCCAGCTCC 660 
ACCTCACCTT CCCCTGGCAC AGCCATCTTT CCACGCCCTG AGCACATGGA AACCCATGAA 720 
GTCCCTTCCT CCACTTATGT TCCCAAAGGC ATGAACTCAA CAGAATCCAA CTCTTCTGCC 780 
TCTGTTAGAC CAAAGGTACT GAGTAGCATC CAGGAAGGGA CAGTCCCTGA CAACACAAGC 840 
TCAGCAAGGG GGAAGGAAGA CGTGAACAAG ACCCTCCCAA ACCTTCAGGT AGTCAACCAC 900 
CAGCAAGGCC CCCACCACAG ACACATCCTG AAGCTGCTGC CGTCCATGGA GGCCACTGGG 960 
GGCGAGAAGT CCAGCACGCC CATCAAGGGC CCCAAGAGGG GACATCCTAG ACAGAACCTA 1020 
CACAAGCATT TTGACATCAA TGAGCATTTG CCCTGGATGA TTGTGCTTTT CCTGCTGCTG 1080 
GTGCTTGTGG TGATTGTGGT GTGCAGTATC CGGAAAAGCT CGAGGACTCT GAAAAAGGGG 1140 
CCCCGGCAGG ATCCCAGTGC CATTGTGGAA AAGGCAGGGC TGAAGAAATC CATGACTCCA 1200 
ACCCAGAACC GGGAGAAATG GATCTACTAC TGCAATGGCC ATGGTATCGA TATCCTGAAG 1260 
CTTGTAGCAG CCCAAGTGGG AAGCCAGTGG AAAGATATCT ATCAGTTTCT TTGCAATGCC 1320 
AGTGAGAGGG AGGTTGCTGC TTTCTCCAAT GGGTACACAG CCGACCACGA GCGGGCCTAC 1380 
GCAGCTCTGC AGCACTGGAC CATCCGGGGC CCCGAGGCCA GCCTCGCCCA GCTAATTAGC 1440 
GCCCTGCGCC AGCACCGGAG AAACGATGTT GTGGAGAAGA TTCGTGGGCT GATGGAAGAC 1500 
ACCACCCAGC TGGAAACTGA CAAACTAGCT CTCCCGATGA GCCCCAGCCC GCTTAGCCCG 1560 
AGCCCCATCC CCAGCCCCAA CGCGAAACTT GAGAATTCCG CTCTCCTGAC GGTGGAGCCT 1620 
TCCCCACAGG ACAAGAACAA GGGCTTCTTC GTGGATGAGT CGGAGCCCCT TCTCCGCTGT 1680 
GACTCTACAT CCAGCGGCTC CTCCGCGCTG AGCAGGAACG GTTCCTTTAT TACCAAAGAA 1740 
AAGAAGGACA CAGTGTTGCG GCAGGTACGC CTGGACCCCT GTGACTTGCA GCCTATCTTT 1800 
GATGACATGC TCCACTTTCT AAATCCTGAG GAGCTGCGGG TGATTGAAGA GATTCCCCAG 1860 
GCTGAGGACA AACTAGACCG GCTATTCGAA ATTATTGGAG TCAAGAGCCA GGAAGCCAGC 1920 
CAGACCCTCC TGGACTCTGT TTATAGCCAT CTTCCTGACC TGCTGTAG 



Seq ID NO: 477 Protein sequence 
Protein Accession #: NP_055267.1

1 11 21 31 41 51 

I I I I I I 

MGTSPSSSTA LASCSRIARR ATATMIAGSL LLLGFLSTTT AQPEQKASNL IGTYRHVDRA 60 

TGQVLTCDKC PAGTYVSEHC TNTSLRVCSS CPVGTFTRHE NGIEKCHDCS QPCPWPMIEK 120 

LPCAALTDRE CTCPPGMFQS NATCAPHTVC PVGWGVRKKG TETEDVRCKQ CARGTFSDVP 180 

SSVMKCKAYT DCLSQNLVVI KPGTKETDNV CGTLPSFSSS TSPSPGTAIF PRPEHMETHE 240

VPSSTYVPKG MNSTESNSSA SVRPKVLSSI QEGTVPDNTS SARGKEDVNK TLPNLQVVNH 300 

QQGPHHRHIL KLLPSMEATG GEKSSTPIKG PKRGHPRQNL HKHFDINEHL PWMIVLFLLL 360 

VLVVIVVCSI RKSSRTLKKG PRQDPSAIVE KAGLKKSMTP TQNREKWIYY CNGHGIDILK 420

LVAAQVGSQW KDIYQFLCNA SEREVAAFSN GYTADHERAY AALQHWTIRG PEASLAQLIS 480 

ALRQHRRNDV VEKIRGLMED TTQLETDKLA LPMSPSPLSP SPIPSPNAKL ENSALLTVEP 540

SPQDKNKGFF VDESEPLLRC DSTSSGSSAL SRNGSFITKE KKDTVLRQVR LDPCDLQPIF 600 
DDMLHFLNPE ELRVIEEIPQ AEDKLDRLFE IIGVKSQEAS QTLLDSVYSH LPDLL



Seq ID NO: 478 DNA sequence
Nucleic Acid Accession #: XM_044533
Coding sequence: 238..2751

1 11 21 31 41 51

I I I I I I 

GCTCTGCCCA AGCCGAGGCT GCGGGGCCGG CGCCGGCGGG AGGACTGCGG TGCCCCGCGG 60 

AGGGGCTGAG TTTGCCAGGG CCCACTTGAC CCTGTTTCCC ACCTCCCGCC CCCCAGGTCC 120 

GGAGGCGGGG GCCCCCGGGG CGACTCGGGG GCGGACCGCG GGGCGGAGCT GCCGCCCGTG 180 

AGTCCGGCCG AGCCACCTGA GCCCGAGCCG CGGGACACCG TCGCTCCTGC TCTCCGAATG 240 

CTGCGCACCG CGATGGGCCT GAGGAGCTGG CTCGCCGCCC CATGGGGOGC GCTGCCGCCT 300 

CGGCCACCGC TGCTGCTGCT CCTGCTGCTG CTGCTCCTGC TGCAGCCGCC GCCTCCGACC 360 

TGGGCGCTCA GCCCCCGGAT CAGCCTGCCT CTGGGCTCTG AAGAGCGGCC ATTCCTCAGA 420

TTCGAAGCTG AACACATCTC CAACTACACA GCCCTTCTGC TGAGCAGGGA TGGCAGGACC 480 

CTGTACGTGG GTGCTCGAGA GGCCCTCTTT GCACTCAGTA GCAACCTCAG CTTCCTGCCA 540 

GGCGGGGAGT ACCAGGAGCT GCTTTGGGGT GCAGACGCAG AGAAGAAACA GCAGTGCAGC 600 

TTCAAGGGCA AGGACCCACA GCGCGACTGT CAAAACTACA TCAAGATCCT CCTGCCGCTC 660 

AGCGGCAGTC ACCTGTTCAC CTGTGGCACA GCAGCCTTCA GCCCCATGTG TACCTACATC 720 

AACATGGAGA ACTTCACCCT GGCAAGGGAC GAGAAGGGGA ATGTCCTCCT GGAAGATGGC 780 

AAGGGCCGTT GTCCCTTCGA CCCGAATTTC AAGTCCACTG CCCTGGTGGT TGATGGCGAG 840 

CTCTACACTG GAACAGTCAG CAGCTTCCAA GGGAATGACC CGGCCATCTC GCGGAGCCAA 900 

AGCCTTCGCC CCACCAAGAC CGAGAGCTCC CTCAACTGGC TGCAAGACCC AGCTTTTGTG 960 

GCCTCAGCCT ACATTCCTGA GAGCCTGGGC AGCTTGCAAG GCGATGATGA CAAGATCTAC 1020 

TTTTTCTTCA GCGAGACTGG CCAGGAATTT GAGTTCTTTG AGAACACCAT TGTGTCCCGC 1080

ATTGCCCGCA TCTGCAAGGG CGATGAGGGT GGAGAGCGGG TGCTACAGCA GCGCTGGACC 1140 

TCCTTCCTCA AGGCCCAGCT GCTGTGCTCA CGGCCCGACG ATGGCTTCCC CTTCAACGTG 1200

CTGCAGGATG TCTTCACGCT GAGCCCCAGC CCCCAGGACT GGCGTGACAC CCTTTTCTAT 1260 

GGGGTCTTCA CTTCCCAGTG GCACAGGGGA ACTACAGAAG GCTCTGCCGT CTGTGTCTTC 1320 

ACAATGAAGG ATGTGCAGAG AGTCTTCAGC GGCCTCTACA AGGAGGTGAA CCGTGAGACA 1380 

CAGCAGTGGT ACACCGTGAC CCACCCGGTG CCCACACCCC GGCCTGGAGC GTGCATCACC 1440 

AACAGTGCCC GGGAAAGGAA GATCAACTCA TCCCTGCAGC TCCCAGACCG CGTGCTGAAC 1500 

TTCCTCAAGG ACCACTTCCT GATGGACGGG CAGGTCCGAA GCCGCATGCT GCTGCTGCAG 1560 

CCCCAGGCTC GCTACCAGCG CGTGGCTGTA CACCGCGTCC CTGGCCTGCA CCACACCTAC 1620 

GATGTCCTCT TCCTGGGCAC TGGTGACGGC CGGCTCCACA AGGCAGTGAG CGTGGGCCCC 1680 

CGGGTGCACA TCATTGAGGA GCTGCAGATC TTCTCATCGG GACAGCCCGT GCAGAATCTG 1740 




CTCCTGGACA CCCACAGGGG GCTGCTGTAT GCGGCCTCAC ACTCGGGCGT AGTCCAGGTG 1800

CCCATGGCCA ACTGCAGCCT GTACAGGAGC TGTGGGGACT GCCTCCTCGC CCGGGACCCC 1860

TACTGTGCTT GGAGCGGCTC CAGCTGCAAG CACGTCAGCC TCTACCAGCC TCAGCTGGCC 1920 

ACCAGGCCGT GGATCCAGGA CATCGAGGGA GCCAGCGCCA AGGACCTTTG CAGCGCGTCT 1980 

TCGGTTGTGT CCCCGTCTTT TGTACCAACA GGGGAGAAGC CATGTGAGCA AGTCCAGTTC 2040 

CAGCCCAACA CAGTGAACAC TTTGGCCTGC CCGCTCCTCT CCAACCTGGC GACCCGACTC 2100 

TGGCTACGCA ACGGGGCCCC CGTCAATGCC TCGGCCTCCT GCCACGTGCT ACCCACTGGG 2160 

GACCTGCTGC TGGTGGGCAC CCAACAGCTG GGGGAGTTCC AGTGCTGGTC ACTAGAGGAG 2220 

GGCTTCCAGC AGCTGGTAGC CAGCTACTGC CCAGAGGTGG TGGAGGACGG GGTGGCAGAC 2280 

CAAACAGATG AGGGTGGCAG TGTACCCGTC ATTATCAGCA CATCGCGTGT GAGTGCACCA 2340

GCTGGTGGCA AGGCCAGCTG GGGTGCAGAC AGGTCCTACT GGAAGGAGTT CCTGGTGATG 2400 

TGCACGCTCT TTGTGCTGGC CGTGCTGCTC CCAGTTTTAT TCTTGCTCTA CCGGCACGGG 2460 

AACAGCATGA AAGTCTTCCT GAAGCAGGGG GAATGTGCCA GCGTGCACCC CAAGACCTGC 2520 

CCTGTGGTGC TGCCCCCTGA GACCCGCCCA CTCAACGGCC TAGGGCCCCC TAGCACCCCG 2580 

CTCGATCACC GAGGGTACCA GTCCCTGTCA GACAGCCCCC CGGGGTCCCG AGTCTTCACT 2640 

GAGTCAGAGA AGAGGCCACT CAGCATCCAA GACAGCTTCG TGGAGGTATC CCCAGTGTGC 2700 

CCCCGGCCCC GGGTCCGCCT TGGCTCGGAG ATCCGTGACT CTGTGGTGTG AGAGCTGACT 2760 

TCCAGAGGAC GCTGCCCTGG CTTCAGGGGC TGTGAATGCT CGGAGAGGGT CAACTGGACC 2820 

TCCCCTCCGC TCTGCTCTTC GTGGAACACG ACCGTGGTGC CCGGCCCTTG GGAGCCTTGG 2880 

GGCCAGCTGG CCTGCTGCTC TCCAGTCAAG TAGCGAAGCT CCTACCACCC AGACACCCAA 2940 

ACAGCCGTGG CCCCAGAGGT CCTGGCCAAA TATGGGGGCC TGCCTAGGTT GGTGGAACAG 3000 

TGCTCCTTAT GTAAACTGAG CCCTTTGTTT AAAAAACAAT TCCAAATGTG AAACTAGAAT 3060 

GAGAGGGAAG AGATAGCATG GCATGCAGCA CACACGGCTG CTCCAGTTCA TGGCCTCCCA 3120 

GGGGTGCTGG GGATGCATCC AAAGTGGTTG TCTGAGACAG AGTTGGAAAC CCTCACCAAC 3180 

TGGCCTCTTC ACCTTCCACA TTATCCCGCT GCCACCGGCT GCCCTGTCTC ACTGCAGATT 3240 

CAGGACCAGC TTGGGCTGCG TGCGTTCTGC CTTGCCAGTC AGCCGAGGAT GTAGTTGTTG 3300 

CTGCCGTCGT CCCACCACCT CAGGGACCAG AGGGCTAGGT TGGCACTGCG GCCCTCACCA 3360 

GGTCCTGGGC TCGGACCCAA CTCCTGGACC TTTCCAGCCT GTATCAGGCT GTGGCCACAC 3420 

GAGAGGACAG CGCGAGCTCA GGAGAGATTT CGTGACAATG TACGCCTTTC CCTCAGAATT 3480 

CAGGGAAGAG ACTGTCGCCT GCCTTCCTCC GTTGTTGCGT GAGAACCCGT GTGCCCCTTC 3540 

CCACCATATC CACCCTCGCT CCATCTTTGA ACTCAAACAC GAGGAACTAA CTGCACCCTG 3600 

GTCCTCTCCC CAGTCCCCAG TTCACCCTCC ATCCCTCACC TTCCTCCACT CTAAGGGATA 3660 

TCAACACTGC CCAGCACAGG GGCCCTGAAT TTATGTGGTT TTTATACATT TTTTAATAAG 3720 
ATGCACTTTA TGTCATTTTT TAATAAAGTC TGAAGAATTA CTGTTT 

Seq ID NO: 479 Protein sequence 
Protein Accession #: XP_044533.3

1 11 21 31 41 51

I I I I I I

MLRTAMGLRS WLAAPWGALP PRPPLLLLLL LLLLLQPPPP TWALSPRISL PLGSEERPFL 60

RFEAEHISNY TALLLSRDGR TLYVGAREAL FALSSNLSFL PGGEYQELLW GADAEKKQQC 120 

SFKGKDPQRD CQNYIKILLP LSGSHLFTCG TAAFSPMCTY INMENFTLAR DEKGNVLLED 180

GKGRCPFDPN FKSTALVVDG ELYTGTVSSF QGNDPAISRS QSLRPTKTES SLNWLQDPAF 240

VASAYIPESL GSLQGDDDKI YFFFSETGQE FEFFENTIVS RIARICKGDE GGERVLQQRW 300 

TSFLKAQLLC SRPDDGFPFN VLQDVFTLSP SPQDWRDTLF YGVFTSQWHR GTTEGSAVCV 360

FTMKDVQRVF SGLYKEVNRE TQQWYTVTHP VPTPRPGACI TNSARERKIN SSLQLPDRVL 420

NFLKDHFLMD GQVRSRMLLL QPQARYQRVA VHRVPGLHHT YDVLFLGTGD GRLHKAVSVG 480 

PRVHIIEELQ IFSSGQPVQN LLLDTHRGLL YAASHSGVVQ VPMANCSLYR SCGDCLLARD 540

PYCAWSGSSC KHVSLYQPQL ATRPWIQDIE GASAKDLCSA SSVVSPSFVP TGEKPCEQVQ 600

FQPNTVNTLA CPLLSNLATR LWLRNGAPVN ASASCHVLPT GDLLLVGTQQ LGEFQCWSLE 660 

EGFQQLVASY CPEVVEDGVA DQTDEGGSVP VIISTSRVSA PAGGKASWGA DRSYWKEFLV 720

MCTLFVLAVL LPVLFLLYRH RNSMKVFLKQ GECASVHPKT CPVVLPPETR PLNGLGPPST 780
PLDHRGYQSL SDSPPGSRVF TESEKRPLSI QDSFVEVSPV CPRPRVRLGS EIRDSVV

Seq ID NO: 480 DNA sequence 

Nucleic Acid Accession #: NM_004217.1

Coding sequence: 58..1092

1 11 21 31 41 51 

GGCCGGGAGA GTAGCAGTGC CTTGGACCCC AGCTCTCCTC CCCCTTTCTC TCTAAGGATG 60 

GCCCAGAAGG AGAACTCCTA CCCCTGGCCC TACGGCCGAC AGACGGCTCC ATCTGGCCTG 120 

AGCACCCTGC CCCAGCGAGT CCTCCGGAAA GAGCCTGTCA CCCCATCTGC ACTTGTCCTC 180 

ATGAGCCGCT CCAATGTCCA GCCCACAGCT GCCCCTGGCC AGAAGGTGAT GGAGAATAGC 240 

AGTGGGACAC CCGACATCTT AACGCGGCAC TTCACAATTG ATGACTTTGA GATTGGGCGT 300 

CCTCTGGGCA AAGGCAAGTT TGGAAACGTG TACTTGGCTC GGGAGAAGAA AAGCCATTTC 360 

ATCGTGGCGC TCAAGGTCCT CTTCAAGTCC CAGATAGAGA AGGAGGGCGT GGAGCATCAG 420 

CTGCGCAGAG AGATCGAAAT CCAGGCCCAC CTGCACCATC CCAACATCCT GCGTCTCTAC 480 

AACTATTTTT ATGACCGGAG GAGGATCTAC TTGATTCTAG AGTATGCCCC CCGCGGGGAG 540

CTCTACAAGG AGCTGCAGAA GAGCTGCACA TTTGACGAGC AGCGAACAGC CACGATCATG 600 

GAGGAGTTGG CAGATGCTCT AATGTACTGC CATGGGAAGA AGGTGATTCA CAGAGACATA 660 

AAGCCAGAAA ATCTGCTCTT AGGGCTCAAG GGAGAGCTGA AGATTGCTGA CTTCGGCTGG 720 

TCTGTGCATG CGCCCTCCCT GAGGAGGAAG ACAATGTGTG GCACCCTGGA CTACCTGCCC 780 

CCAGAGATGA TTGAGGGGCG CATGCACAAT GAGAAGGTGG ATCTGTGGTG CATTGGAGTG 840

CTTTGCTATG AGCTGCTGGT GGGGAACCCA CCCTTTGAGA GTGCATCACA CAACGAGACC 900 

TATCGCCGCA TCGTCAAGGT GGACCTAAAG TTCCCCGCTT CTGTGCCCAC GGGAGCCCAG 960 

GACCTCATCT CCAAACTGCT CAGGCATAAC CCCTCGGAAC GGCTGCCCCT GGCCCAGGTC 1020 

TCAGCCCACC CTTGGGTCCG GGCCAACTCT CGGAGGGTGC TGCCTCCCTC TGCCCTTCAA 1080

TCTGTCGCCT GATGGTCCCT GTCATTCACT CGGGTGCGTG TGTTTGTATG TCTGTGTATG 1140 

TATAGGGGAA AGAAGGGATC CCTAACTGTT CCCTTATCTG TTTTCTACCT CCTCCTTTGT 1200 
TTAATAAAGG CTGAAGCTTT TTGT 

Seq ID NO: 481 Protein sequence 
Protein Accession #: NP_004208

1 11 21 31 41 51

I I I I I I

MAQKENSYPW PYGRQTAPSG LSTLPQRVLR KEPVTPSALV LMSRSNVQPT AAPGQKVMEN 60

SSGTPDILTR HFTIDDFEIG RPLGKGKFGN VYLAREKKSH FIVALKVLFK SQIEKEGVEH 120

QLRREIEIQA HLHHPNILRL YNYFYDRRRI YLILEYAPRG ELYKELQKSC TFDEQRTATI 180

MEELADALMY CHGKKVIHRD IKPENLLLGL KGELKIADFG WSVHAPSLRR KTMCGTLDYL 240

PPEMIEGRMH NEKVDLWCIG VLCYELLVGN PPFESASHNE TYRRIVKVDL KFPASVPTGA 300
QDLISKLLRH NPSERLPLAQ VSAHPWVRAN SRRVLPPSAL QSVA

Seq ID NO: 482 DNA sequence
Nucleic Acid Accession #: AK055663
Coding sequence: 38..1423

1 11 21 31 41 51

I I I I I I

AGAACGGCTT CCGGCGGGAG CTGTGCAGCT CCTTATCATG GGGACAATTC ATCTCTTTCG 60

AAAACCACAA AGATCCTTTT TTGGCAAGTT GTTACGGGAA TTTAGACTTG TAGCAGCTGA 120

CCGAAGGTCC TGGAAGATAC TGCTCTTTGG TGTAATAAAC TTGATATGTA CTGGCTTCCT 180

GCTTATGTGG TGCAGTTCTA CTAATAGTAT AGCTTTAACT GCCTATACTT ACCTGACCAT 240

TTTTGATCTT TTTAGTTTAA TGACATGTTT AATAAGTTAC TGGGTAACAT TGAGGAAACC 300

TAGCCCTGTC TATTCATTTG GGTTTGAAAG ATTAGAAGTC CTGGCTGTAT TTGCCTCCAC 360

AGTCTTGGCA CAGTTGGGAG CTCTCTTTAT ATTAAAAGAA AGTGCAGAAC GCTTTTTGGA 420

ACAGCCCGAG ATACACACGG GAAGATTATT AGTTGGTACT TTTGTGGCTC TTTGTTTCAA 480

CCTGTTCACG ATGCTTTCTA TTCGGAATAA ACCTTTTGCT TATGTCTCAG AAGCTGCTAG 540

TACGAGCTGG CTTCAAGAGC ATGTTGCAGA TCTTAGTCGA AGCTTGTGTG GAATTATTCC 600

GGGACTTAGC AGTATCTTCC TTCCCCGAAT GAATCCATTT GTTTTGATTG ATCTTGCTGG 660

AGCATTTGCT CTTTGTATTA CATATATGCT CATTGAAATT AATAATTATT TTGCCGTAGA 720

CACTGCCTCT GCTATAGCTA TTGCCTTGAT GACATTTGGC ACTATGTATC CCATGAGTGT 780

GTACAGTGGG AAAGTCTTAC TCCAGACAAC ACCACCCCAT GTTATTGGTC AGTTGGACAA 840

ACTCATCAGA GAGGTATCTA CCTTAGATGG AGTTTTAGAA GTCCGAAATG AACATTTTTG 900

GACCCTAGGT TTTGGCTCAT TGGCTGGATC AGTGCATGTA AGAATTCGAC GAGATGCCAA 960

TGAACAAATG GTTCTTGCTC ATGTGACCAA CAGGCTGTAC ACTCTAGTGT CTACTCTAAC 1020

TGTTCAAATT TTCAAGGATG ACTGGATTAG GCCTGCCTTA TTGTCTGGGC CTGTTGCAGC 1080

CAATGTCCTA AACTTTTCAG ATCATCACGT AATCCCAATG OCrCTTTTAA AGGGTACTGA 1140

TGATTTGAAC CCAGTTACAT CAACTCCAGC TAAACCTAGT AGTCCACCTC CAGAATTTTC 1200

ATTTAACACT CCTGGGAAAA ATGTGAACCC AGTTATTCTT CTAAACACAC AAACAAGGCC 1260

TTATGGTTTT GGTCTCAATC ATGGACACAC ACCTTACAGC AGCATGCTTA ATCAAGGACT 1320

TGGAGTTCCA GGAATTGGAG CAACTCAAGG ATTGAGGACT GGTTTTACAA ATATACCAAG 1380

TAGATATGGA ACTAATAATA GAATTGGACA ACCAAGACCA TGATAGACTC TAACTTATTT 1440

TTATAAGGAA TATTGACTCC TTGGCTTCCA ATTTATTTAG TAATCCAACT TTGCATTGAC 1500

TGTTTAATCA TTTACTCTAA ATGTTAGATA ATAGTAGTCT TGTTCACATT TCATGAAACC 1560

TATGAAACTA TATTTTTGTA AAATGTATTT GTGACAGTGA AATCCTCGTA AATGTTAAAG 1620

GCTTTAAATA GGCTTCCTTT AGAAAATGTG TTTCTTTAAA TTTGGATTTT GGTATCTTTG 1680

GTTTTGTAGT TGACTGCAGT GTGATGTGAC CTTACCTTTA TAAGAGCCAC TTGATGGAGT 1740

AGATCTGTCA CATTACTAAG ATACGATATT TCTTTTTTTT TCCGAGACGG AGTCTTGCTC 1800

TGCCACTGTG CCCGGCCAAT ACATTATTAT TAACTTAAGG CTGTACTTTA TTAAGGCTTC 1860

CTTAGTTTTT GTTTTGTTTT GTTTTTTGAG ATGGAGTCTC ACTCTGTCGC CCAGGCTGGA 1920

ATGCAGTGGC ATGATCTCAG CTCACTGCAA [illegible] CTGAGTTCAA ATGATTCTCC 1980

TGCCTCAGCC TCCCGAGTAG CTGGGATTAC AGGCACCTGC CACCACGCCC AGCTAATTTT 2040

TGTATTTTTA GTAAAGACGG GGGATTTCAC CATGTTGGCC AGGCTGGTCT TGAACTCCTG 2100

ACCTCATGAT CCACCCACCT TAGCCTCCCA AAGTGCTGGG ATTAGGTGTG AGCCACCGCA 2160

CCTGGCCGAT ATTTTCTTTA ATGAAATTTA TAAATATGCT TCTTGAATAA TACACATTTT 2220

GGGAAAGGGA AAAATGTCTG TTCAAAAAGT AAAGGTCTCT TTTATAGCTT TTCCAAACTT 2280

AATTGCTAAA TTTTTCTTTG AGGTTCTCCT GAATTATGTC TTACAAACTA AAAGCAAAAA 2340

TTTTTAGCAG AAATTTTGGA ATACATTCTA TCTAGCACAA TTTGAATTTT TAATTATCAA 2400
GATTTTTGTT AAAGTTTCTC TCCTTTAAAA ATTTTAGTAC ATTTGTAAAT

Seq ID NO: 483 Protein sequence
Protein Accession #: BAB70980.1

1 11 21 31 41 51

I I I I I I

MGTIHLFRKP QRSFFGKLLR EFRLVAADRR SWKILLFGVI NLICTGFLLM WCSSTNSIAL 60

TAYTYLTIFD LFSLMTCLIS YWVTLRKPSP VYSFGFERLE VLAVFASTVL AQLGALFILK 120

ESAERFLEQP EIHTGRLLVG TFVALCFNLF TMLSIRNKPF AYVSEAASTS WLQEHVADLS 180

RSLCGIIPGL SSIFLPRMNP FVLIDLAGAF ALCITYMLIE INNYFAVDTA SAIAIALMTF 240

GTMYPMSVYS GKVLLQTTPP HVIGQLDKLI REVSTLDGVL EVRNEHFWTL GFGSLAGSVH 300

VRIRRDANEQ MVLAHVTNRL YTLVSTLTVQ IFKDDWIRPA LLSGPVAANV LNFSDHHVIP 360

MPLLRGTDDL NPVTSTPAKP SSPPPEFSFN TPGKNVNPVI LLNTQTRPYG FGLNHGHTPY 420
SSMLNQGLGV PGIGATQGLR TGFTNIPSRY GTNNRIGQPR P

Seq ID NO: 484 DNA sequence
Nucleic Acid Accession #: FGENESH predicted
Coding sequence: 1..900

1 11 21 31 41 51

I I I I I I

ATGCCGCCGC GGGAGCTGAG CGAGGCCGAG CCGCCCCCGC TCCGGGCCCC GACCCCTCCC 60
CCGCGGCGGC GTAGCGCGCC CCCAGAGCTG GGCATCAAGT GCGTGCTGGT GGGCGACGGC 120
GCCGTGGGCA AGAGCAGCCT CATCGTCAGC TACACCTGCA ATGGGTACCC CGCGCGCTAC 180
CGGCCCACTG CGCTGGACAC CTTCTCTGGT ACGTACGTTC AATCGCCCGT GCGGCCGCGT 240
GGCTGCGGCG GGGCTGTGCA CCGGGGAGCT GGGGCGGGCG TCTCGGCGGG AGGGCGCAGA 300
GGACCCCGGG GAGGAGACTG GAGCAGGCCC CGAGGTGGCG CTGGTGCGGC CCAGGACGCT 360
CTTCCTAACT CAGGCTCTCC CCGCCCCGCC CCTGCAGTGC AAGTCCTGGT GGATGGAGCT 420
CCGGTGCGCA TTGAGCTCTG GGACACAGCG GGACAGGAGG ATTTTGACCG ACTTCGTTCC 480
CTTTGCTACC CGGATACCGA TGTCTTCCTG GCGTGCTTCA GCGTGGTGCA GCCCAGCTCC 540
TTTCAAAACA TCACAGAGAA ATGGCTGCCC GAGATCCGCA CGCACAACCC CCAGGCGCCT 600
GTGCTGCTGG TGGGCACCCA GGCCGACCTG AGGGACGATG TCAACGTACT AATTCAGCTG 660
GACCAGGGGG GCCGGGAGGG CCCCGTGCCC CAACCCCAGG CTCAGGGTCT GCCCGAGAAG 720
ATCCGAGCCT GCTGCTACCT TGAGTGCTCA GCCTTGACGC AGAAGAACTT GAAGGAAGTA 780
TTTGACTCGG CTATTCTCAG TGCCATTGAG CACAAAGCCC GGCTGGAGAA GAAACTGAAT 840
GCCAAAGGTG TGCGCACCCT CTCCCGCTGC CGCTGGAAGA AGTTCTTCTG CTTCGTTTGA

Seq ID NO: 485 Protein sequence
Protein Accession #: FGENESH predicted

1 11 21 31 41 51

I I I I I I

MPPRELSEAE PPPLRAPTPP PRRRSAPPEL GIKCVLVGDG AVGKSSLIVS YTCNGYPARY 60
RPTALDTFSG TYVQSPVRPR GCGGAVHRGA GAGVSAGGRR GPRGGDWSRP RGGAGAAQDA 120
LPNSGSPRPA PAVQVLVDGA PVRIELWDTA GQEDFDRLRS LCYPDTDVFL ACFSVVQPSS 180
FQNITEKWLP EIRTHNPQAP VLLVGTQADL RDDVNVLIQL DQGGREGPVP QPQAQGLAEK 240
IRACCYLECS ALTQKNLKEV FDSAILSAIE HKARLEKKLN AKGVRTLSRC RWKKFFCFV

Seq ID NO: 486 DNA sequence
Nucleic Acid Accession #: XM_063832.2
Coding sequence: 1..711

1 11 21 31 41 51

I I I I I I

ATGCCGCCGC GGGAGCTGAG CGAGGCCGAG CCGCCCCCGC TCCGGGCCCC GACCCCTCCC 60
CCGCGGCGGC GTAGCGCGCC CCCAGAGCTG GGCATCAAGT GCGTGCTGGT GGGCGACGGC 120
GCCGTGGGCA AGAGCAGCCT CATCGTCAGC TACACCTGCA ATGGGTACCC CGCGCGCTAC 180
CGGCCCACTG CGCTGGACAC CTTCTCTGTG CAAGTCCTGG TGGATGGAGC TCCGGTGCGC 240
ATTGAGCTCT GGGACACAGC GGGACAGGAG GATTTTGACC GACTTCGTTC CCTTTGCTAC 300
CCGGATACCG ATGTCTTCCT GGCGTGCTTC AGCGTGGTGC AGCCCAGCTC CTTTCAAAAC 360
ATCACAGAGA AATGGCTGCC CGAGATCCGC ACGCACAACC CCCAGGCGCC TGTGCTGCTG 420
GTGGGCACCC AGGCCGACCT GAGGGACGAT GTCAACGTAC TAATTCAGCT GGACCAGGGG 480
GGCCGGGAGG GCCCCGTGCC CCAACCCCAG GCTCAGGGTC TGGCCGAGAA GATCCGAGCC 540
TGCTGCTACC TTGAGTGCTC AGCCTTGACG CAGAAGAACT TGAAGGAAGT ATTTGACTCG 600
GCTATTCTCA GTGCCATTGA GCACAAAGCC CGGCTGGAGA AGAAACTGAA TGCCAAAGGT 660
GTGCGCACCC TCTCCCGCTG CCGCTGGAAG AAGTTCTTCT GCTTCGTTTG A

Seq ID NO: 487 Protein sequence
Protein Accession #: XP_063832.1

1 11 21 31 41 51

I I I I I I

MPPRELSEAE PPPLRAPTPP PRRRSAPPEL GIKCVLVGDG AVGKSSLIVS YTCNGYPARY 60
RPTALDTFSV QVLVDGAPVR IELWDTAGQE DFDRLRSLCY PDTDVFLACF SVVQPSSFQN 120
ITEKWLPEIR THNPQAPVLL VGTQADLRDD VNVLIQLDQG GREGPVPQPQ AQGLAEKIRA 180
CCYLECSALT QKNLKEVFDS AILSAIEHKA RLEKKLNAKG VRTLSRCRWK KFFCFV

Seq ID NO: 488 DNA sequence
Nucleic Acid Accession #: NM_014398.1
Coding sequence: 64..1314

1 11 21 31 41 51

I I I I I I

GGCACCGATT CGGGGCCTGC CCGGACTTCG CCGCACGCTG CAGAACCTCG CCCAGCGCCC 60

ACCATGCCCC GGCAGCTCAG CGCGGCGGCC GCGCTCTTCG CGTCCCTGGC CGTAATTTTG 120

CACGATGGCA GTCAAATGAG AGCAAAAGCA TTTCCAGAAA CCAGAGATTA TTCTCAACCT 180

ACTGCAGCAG CAACAGTACA GGACATAAAA AAACCTGTCC AGCAACCAGC TAAGCAAGCA 240

CCTCACCAAA CTTTAGCAGC AAGATTCATG GATGGTCATA TCACCTTTCA AACAGCGGCC 300

ACAGTAAAAA TTCCAACAAC TACCCCAGCA ACTACAAAAA ACACTGCAAC CACCAGCCCA 360

ATTACCTACA CCCTGGTCAC AACCCAGGCC ACACCCAACA ACTCACACAC AGCTCCTCCA 420

GTTACTGAAG TTACAGTCGG CCCTAGCTTA GCCCCTTATT CACTGCCACC CACCATCACC 480

CCACCAGCTC ATACAGCTGG AACCAGTTCA TCAACCGTCA GCCACACAAC TGGGAACACC 540

ACTCAACCCA GTAACCAGAC CACCCTTCCA GCAACTTTAT CGATAGCACT GCACAAAAGC 600

ACAACCGGTC AGAAGCCTGA TCAACCCACC CATGCCCCAG GAACAACGGC AGCTGCCCAC 660

AATACCACCC GCACAGCTGC ACCTGCCTCC ACGGTTCCTG GGCCCACCCT TGCACCTCAG 720

CCATCGTCAG TCAAGACTGG AATTTATCAG GTTCTAAACG GAAGCAGACT CTGTATAAAA 780

GCAGAGATGG GGATACAGCT GATTGTTCAA GACAAGGAGT CGGTTTTTTC ACCTCGGAGA 840

TACTTCAACA TCGACCCCAA CGCAACGCAA GCCTCTGGGA ACTGTGGCAC CCGAAAATCC 900

AACCTTCTGT TGAATTTTCA GGGCGGATTT GTGAATCTCA CATTTACCAA GGATGAAGAA 960

TCATATTATA TCAGTGAAGT GGGAGCCTAT TTGACCGTCT CAGATCCAGA GACAGTTTAC 1020

CAAGGAATCA AACATGCGGT GGTGATGTTC CAGACAGCAG TCGGGCATTC CTTCAAGTGC 1080

GTGAGTGAAC AGAGCCTCCA GTTGTCAGCC CACCTGCAGG TGAAAACAAC CGATGTCCAA 1140

CTTCAAGCCT TTGATTTTGA AGATGACCAC TTTGGAAATG TGGATGAGTG CTCGTCTGAC 1200

TACACAATTG TGCTTCCTGT GATTGGGGCC ATCGTGGTTG GTCTCTGCCT TATGGGTATG 1260

GGTGTCTATA AAATCCGCCT AAGGTGTCAA TCATCTGGAT ACCAGAGAAT CTAATTGTTG 1320

CCCGGGGGGA ATGAAAATAA TGGAATTTAG AGAACTCTTT CATCCCTTCC AGGATGGATG 1380

TTGGGAAATT CCCTCAGAGT GTGGGTCCTT CAAACAATGT AAACCACCAT CTTCTATTCA 1440

AATGAAGTGA GTCATGTGTG ATTTAAGTTC AGGCAGCACA TCAATTTCTA AATACTTTTT 1500

GTTTATTTTA TGAAAGATAT AGTGAGCTGT TTATTTTCTA GTTTCCTTTA GAATATTTTA 1560

GCCACTCAAA GTCAACATTT GAGATATGTT GAATTAACAT AATATATGTA AAGTAGAATA 1620

AGCCTTCAAA TTATAAACCA AGGGTCAATT GTAACTAATA CTACTGTGTG TGCATTGAAG 1680

ATTTTATTTT ACCCTTGATC TTAACAAAGC CTTTGCTTTG TTATCAAATG GACTTTCAGT 1740

GCTTTTACTA TCTGTGTTTT ATGGTTTCAT GTAACATACA TATTCCTGGT GTAGCACTTA 1800

ACTCCTTTTC CACTTTAAAT TTGTTTTTGT TTTTTGAGAC GGAGTTTCAC TCTTGTCACC 1860

CAGGCTGGAG TACAGTGGCA CGATCTCGGC TTATGGCAAC CTCCGCCTCC CGGGTTCAAG 1920

TGATTCTCCT GCTTCAGCTT CCCGAGTAGC TGGGATTACA GGCACACACT ACCACGCCTG 1980

GCTAATTTTT GTATTTTTAT TATAGACGGG TTTCACCATG TTGGCCAGAC TGGTCTTGAA 2040

CTCTTGACCT CAGGTGATCC ACCCACCTCA GCCTCCCAAA GTGCTGGGAT TACAGGCATG 2100

AGCCATTGCG CCCGGCCTTA AATGTTTTTT TTAATCATCA AAAAGAACAA CATATCTCAG 2160

GTTGTCTAAG TGTTTTTATG TAAAACCAAC AAAAAGAACA AATCAGCTTA TATTTTTTAT 2220

CTTGATGACT CCTGCTCCAG AATTGCTAGA CTAAGAATTA GGTGGCTACA GATGGTAGAA 2280 

CTAAACAATA AGCAAGAGAC AATAATAATG GCCCTTAATT ATTAACAAAG TGCCAGAGTC 2340 

TAGGCTAAGC ACTTTATCTA TATCTCATTT CATTCTCACA ACTTATAAGT GAATGAGTAA 2400 

ACTGAGACTT AAGGGAACTG AATCACTTAA ATGTCACCTG GCTAACTGAT GGCAGAGCCA 2460 

GAGCTTGAAT TCATGTTGGT CTGACATCAA GGTCTTTGGT CTTCTCCCTA CACCAAGTTA 2520 

CCTACAAGAA CAATGACACC ACACTCTGCC TGAAGGCTCA CACCTCATAC CAGCATACGC 2580 

TCACCTTACA GGGAAATGGG TTTATCCAGG ATCATGAGAC ATTAGGGTAG ATGAAAGGAG 2640 

AGCTTTGCAG ATAACAAAAT AGCCTATCCT TAATAAATCC TCCACTCTCT GGAAGGAGAC 2700 

TGAGGGGCTT TGTAAAACAT TAGTCAGTTG CTCATTTTTA TGGGATTGCT TAGCTGGGCT 2760 

GTAAAGATGA AGGCATCAAA TAAACTCAAA GTATTTTTAA ATTTTTTTGA TAATAGAGAA 2820 

ACTTCGCTAA CCAACTGTTC TTTCTTGAGT GTATAGCCCC ATCTTGTGGT AACTTGCTGC 2880

TTCTGCACTT CATATCCATA TTTCCTATTG TTCACTTTAT TCTGTAGAGC AGCCTGCCAA 2940 

GAATTTTATT TCTGCTGTTT TTTTTGCTGC TAAAGAAAGG AACTAAGTCA GGATGTTAAC 3000 

AGAAAAGTCC ACATAACCCT AGAATTCTTA GTCAAGGAAT AATTCAAGTC AGCCTAGAGA 3060 

CCATGTTGAC TTTCCTCATG TGTTTCCTTA TGACTCAGTA AGTTGGCAAG GTCCTGACTT 3120 
TAGTCTTAAT AAAACATTGA ATTGTAGTAA AGGTTTTTGC AATAAAAACT TACTCTGG 

Seq ID NO: 489 Protein sequence 
Protein Accession #: NP_055213.1 

1 11 21 31 41 51

I I I I I I

MPRQLSAAAA LFASLAVILH DGSQMRAKAF PETRDYSQPT AAATVQDIKK PVQQPAKQAP 60

HQTLAARFMD GHITFQTAAT VKIPTTTPAT TKNTATTSPI TYTLVTTQAT PNNSHTAPPV 120

TEVTVGPSLA PYSLPPTITP PAHTAGTSSS TVSHTTGNTT QPSNQTTLPA TLSIALHKST 180

TGQKPDQPTH APGTTAAAHN TTRTAAPAST VPGPTLAPQP SSVKTGIYQV LNGSRLCIKA 240

EMGIQLIVQD KESVFSPRRY FNIDPNATQA SGNCGTRKSN LLLNFQGGFV NLTFTKDEES 300

YYISEVGAYL TVSDPETVYQ GIKHAVVMFQ TAVGHSFKCV SEQSLQLSAH LQVKTTDVQL 360
QAFDFEDDHF GNVDECSSDY TIVLPVIGAI VVGLCLMGMG VYKIRLRCQS SGYQRI

Seq ID NO: 490 DNA sequence 

Nucleic Acid Accession #: NM_005409.3

Coding sequence: 94..378

1 11 21 31 41 51

I I I I I I

TTCCTTTCAT GTTCAGCATT TCTACTCCTT CCAAGAAGAG CAGCAAAGCT GAAGTAGCAG 60 

CAACAGCACC AGCAGCAACA GCAAAAAACA AACATGAGTG TGAAGGGCAT GGCTATAGCC 120 

TTGGCTGTGA TATTGTGTGC TACAGTTGTT CAAGGCTTCC CCATGTTCAA AAGAGGACGC 180 

TGTCTTTGCA TAGGCCCTGG GGTAAAAGCA GTGAAAGTGG CAGATATTGA GAAAGCCTCC 240 

ATAATGTACC CAAGTAACAA CTGTGACAAA ATAGAAGTGA TTATTACCCT GAAAGAAAAT 300 

AAAGGACAAC GATGCCTAAA TCCCAAATCG AAGCAAGCAA GGCTTATAAT CAAAAAAGTT 360 

GAAAGAAAGA ATTTTTAAAA ATATCAAAAC ATATGAAGTC CTGGAAAAGG GCATCTGAAA 420 

AACCTAGAAC AAGTTTAACT GTGACTACTG AAATGACAAG AATTCTACAG TAGGAAACTG 480 

AGACTTTTCT ATGGTTTTGT GACTTTCAAC TTTTGTACAG TTATGTGAAG GATGAAAGGT 540 

GGGTGAAAGG ACCAAAAACA GAAATACAGT CTTCCTGAAT GAATGACAAT CAGAATTCCA 600 

CTGCCCAAAG GAGTCCAGCA ATTAAATGGA TTTCTAGGAA AAGCTACCTT AAGAAAGGCT 660 

GGTTACCATC GGAGTTTACA AAGTGCTTTC ACGTTCTTAC TTGTTGTATT ATACATTCAT 720 

GCATTTCTAG GCTAGAGAAC CTTCTAGATT TGATGCTTAC AACTATTCTG TTGTGACTAT 780 

GAGAACATTT CTGTCTCTAG AAGTTATCTG TCTGTATTGA TCTTTATGCT ATATTACTAT 840 

CTGTGGTTAC AGTGGAGACA TTGACATTAT TACTGGAGTC AAGCCCTTAT AAGTCAAAAG 900 

CATCTATGTG TCGTAAAGCA TTCCTCAAAC ATTTTTTCAT GCAAATACAC ACTTCTTTCC 960 

CCAAATATCA TGTAGCACAT CAATATGTAG GGAAACATTC TTATGCATCA TTTGGTTTGT 1020 

TTTATAACCA ATTCATTAAA TGTAATTCAT AAAATGTACT ATGAAAAAAA TTATACGCTA 1080 

TGGGATACTG GCAACAGTGC ACATATTTCA TAACCAAATT AGCAGCACCG GTCTTAATTT 1140 

GATGTTTTTC AACTTTTATT CATTGAGATG TTTTGAAGCA ATTAGGATAT GTGTGTTTAC 1200

TGTACTTTTT GTTTTGATCC GTTTGTATAA ATGATAGCAA TATCTTGGAC ACATTTGAAA 1260 

TACAAAATGT TTTTGTCTAC CAAAGAAAAA TGTTGAAAAA TAAGCAAATG TATACCTAGC 1320 

AATCACTTTT ACTTTTTGTA ATTCTGTCTC TTAGAAAAAT ACATAATCTA ATCAATTTCT 1380 

TTGTTCATGC CTATATACTG TAAAATTTAG GTATACTCAA GACTAGTTTA AAGAATCAAA 1440 
GTCATTTTTT TCTCTAATAA ACTACCACAA CCTTTCTTTT TTAAAAAAAA AAA 

Seq ID NO: 491 Protein sequence 
Protein Accession #: NP_005400.1 

1 11 21 31 41 51 

I I I I I I

MSVKGMAIAL AVILCATVVQ GFPMFKRGRC LCIGPGVKAV KVADIEKASI MYPSNNCDKI 60

EVIITLKENK GQRCLNPKSK QARLIIKKVE RKNF 

Seq ID NO: 492 DNA sequence 

Nucleic Acid Accession #: NM_000577.1 

Coding sequence: 41..520

1 11 21 31 41 51

I I I I I I

GGCACGAGGG GAAGACCTCC TGTCCTATCA GGCCCTCCCC ATGGCTTTAG AGACGATCTG 60 

CCGACCCTCT GGGAGAAAAT CCAGCAAGAT GCAAGCCTTC AGAATCTGGG ATGTTAACCA 120 

GAAGACCTTC TATCTGAGGA ACAACCAACT AGTTGCCGGA TACTTGCAAG GACCAAATGT 180 

CAATTTAGAA GAAAAGATAG ATGTGGTACC CATTGAGCCT CATGCTCTGT TCTTGGGAAT 240

CCATGGAGGG AAGATGTGCC TGTCCTGTGT CAAGTCTGGT GATGAGACCA GACTCCAGCT 300 

GGAGGCAGTT AACATCACTG ACCTGAGCGA GAACAGAAAG CAGGACAAGC GCTTCGCCTT 360 

CATCCGCTCA GACAGTGGCC CCACCACCAG TTTTGAGTCT GCCGCCTGCC CCGGTTGGTT 420 

CCTCTGCACA GCGATGGAAG CTGACCAGCC CGTCAGCCTC ACCAATATGC CTGACGAAGG 480 

CGTCATGGTC ACCAAATTCT ACTTCCAGGA GGACGAGTAG TACTGCCCAG GCCTGCCTGT 540 

TCCCATTCTT GCATGGCAAG GACTGCAGGG ACTGCCAGTC CCCCTGCCCC AGGGCTCCCG 600 



GCTATGGGGG CACTGAGGAC CAGCCATTGA GGGGTGGACC CTCAGAAGGC GTCACAACAA 660

CCTGGTCACA GGACTCTGCC TCCTCTTCAA CTGACCAGCC TCCATGCTGC CTCCAGAATG 720 

GTCTTTCTAA TGTGTGAATC AGAGCACAGC AGCCCCTGCA CAAAGCCCTT CCATGTCGCC 780 

TCTGCATTCA GGATCAAACC CCGACCACCT GCCCAACCTG CTCTCCTCTT GCCACTGCCT 840 

CTTCCTCCCT CATTCCACCT TCCCATGCCC TGGATCCATC AGGCCACTTG ATGACCCCCA 900 

ACCAAGTGGC TCCCACACCC TGTTTTACAA AAAAGAAAAG ACCAGTCCAT GAGGGAGGTT 960 

TTTAAGGGTT TGTGGAAAAT GAAAATTAGG ATTTCATGAT TTTTTTTTTT CAGTCCCCGT 1020 

GAAGGAGAGC CCTTCATTTG GAGATTATGT TCTTTCGGGG AGAGGCTGAG GACTTAAAAT 1080 

ATTCCTGCAT TTGTGAAATG ATGGTGAAAG TAAGTGGTAG CTTTTCCCTT CTTTTTCTTC 1140 

TTTTTTTGTG ATGTCCCAAC TTGTAAAAAT TAAAAGTTAT GGTACTATGT TAGCCCCATA 1200 

ATTTTTTTTT TCCTTTTAAA ACACTTCCAT AATCTGGACT CCTCTGTCCA GGCACTGCTG 1260 

CCCAGCCTCC AAGCTCCATC TCCACTCCAG ATTTTTTACA GCTGCCTGCA GTACTTTACC 1320 

TCCTATCAGA AGTTTCTCAG CTCCCAAGGC TCTGAGCAAA TGTGGCTCCT GGGGGTTCTT 1380 

TCTTCCTCTG CTGAAGGAAT AAATTGCTCC TTGACATTGT AGAGCTTCTG GCACTTGGAG 1440 

ACTTGTATGA AAGATGGCTG TGCCTCTGCC TGTCTCCCCC ACCAGGCTGG GAGCTCTGCA 1500 

GAGCAGGAAA CATGACTCGT ATATGTCTCA GGTCCCTGCA GGGCCAAGCA CCTAGCCTCG 1560 

CTCTTGGCAG GTACTCAGCG AATGAATGCT GTATATGTTG GGTGCAAAGT TCCCTACTTC 1620 

CTGTGACTTC AGCTCTGTTT TACAATAAAA TCTTGAAAAT GCCTAAAAAA AAAAAAAAAA 1680 
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAA 

Seq ID NO: 493 Protein sequence 
Protein Accession #: NP_000568.1

1 11 21 31 41 51

I I I I I I

MALETICRPS GRKSSKMQAF RIWDVNQKTF YLRNNQLVAG YLQGPNVNLE EKIDVVPIEP 60

HALFLGIHGG KMCLSCVKSG DETRLQLEAV NITDLSENRK QDKRFAFIRS DSGPTTSFES 120 
AACPGWFLCT AMEADQPVSL TNMPDEGVMV TKFYFQEDE 

Seq ID NO: 494 DNA sequence 

Nucleic Acid Accession #: NM_002081.1

Coding sequence: 222..1898

1 11 21 31 41 51 

I I I I I I 

GGCTGCCCGA GCGAGCGTTC GGACCTCGCA CCCCGCGCGC CCCGCGCCGC CGCCGCCGCC 60 

GGCTTTTGTT GTCTCCGCCT CCTCGGCCGC CGCCGCCTCT GGACCGCGAG CCGCGCGCGC 120 

CGGGACCTTG GCTCTGCCCT TCGCGGGCGG GAACTGCGCA GGACCCGGCC AGGATCCGAG 180 

AGAGGCGCGG GCGGGTGGCC GGGGGCGCCG CCGGCCCCGC CATGGAGCTC CGGGCCCGAG 240 

GCTGGTGGCT GCTATGTGCG GCCGCAGCGC TGGTCGCCTG CGCCCGCGGG GACCCGGCCA 300 

GCAAGAGCCG GAGCTGCGGC GAGGTCCGCC AGATCTACGG AGCCAAGGGC TTCAGCCTGA 360 

GCGACGTGCC CCAGGCGGAG ATCTCGGGTG AGCACCTGCG GATCTGTCCC CAGGGCTACA 420 

CCTGCTGCAC CAGCGAGATG GAGGAGAACC TGGCCAACCG CAGCCATGCC GAGCTGGAGA 480 

CCGCGCTCCG GGACAGCAGC CGCGTCCTGC AGGCCATGCT TGCCACCCAG CTGCGCAGCT 540 

TCGATGACCA CTTCCAGCAC CTGCTGAACG ACTCGGAGCG GACGCTGCAG GCCACCTTCC 600 

CCGGCGCCTT CGGAGAGCTG TACACGCAGA ACGCGAGGGC CTTCCGGGAC CTGTACTCAG 660 

AGCTGCGCCT GTACTACCGC GGTGCCAACC TGCACCTGGA GGAGACGCTG GCCGAGTTCT 720 

GGGCCCGCCT GCTCGAGCGC CTCTTCAAGC AGCTGCACCC CCAGCTGCTG CTGCCTGATG 780 

ACTACCTGGA CTGCCTGGGC AAGCAGGCCG AGGCGCTGCG GCCCTTCGGG GAGGCCCCGA 840 

GAGAGCTGCG CCTGCGGGCC ACCCGTGCCT TCGTGGCTGC TCGCTCCTTT GTGCAGGGCC 900 

TGGGCGTGGC CAGCGACGTG GTCCGGAAAG TGGCTCAGGT CCCCCTGGGC CCGGAGTGCT 960 

CGAGAGCTGT CATGAAGCTG GTCTACTGTG CTCACTGCCT GGGAGTCCCC GGCGCCAGGC 1020 

CCTGCCCTGA CTATTGCCGA AATGTGCTCA AGGGCTGCCT TGCCAACCAG GCCGACCTGG 1080 

ACGCCGAGTG GAGGAACCTC CTGGACTCCA TGGTGCTCAT CACCGACAAG TTCTGGGGTA 1140 

CATCGGGTGT GGAGAGTGTC ATCGGCAGCG TGCACACGTG GCTGGCGGAG GCCATCAACG 1200 

CCCTCCAGGA CAACAGGGAC ACGCTCACGG CCAAGGTCAT CCAGGGCTGC GGGAACCCCA 1260 

AGGTCAACCC CCAGGGCCCT GGGCCTGAGG AGAAGCGGCG CCGGGGCAAG CTGGCCCCGC 1320 

GGGAGAGGCC ACCTTCAGGC ACGCTGGAGA AGCTGGTCTC TGAAGCCAAG GCCCAGCTCC 1380 

GCGACGTCCA GGACTTCTGG ATCAGCCTCC CAGGGACACT GTGCAGTGAG AAGATGGCCC 1440 

TGAGCACTGC CAGTGATGAC CGCTGCTGGA ACGGGATGGC CAGAGGCCGG TACCTCCCCG 1500 

AGGTCATGGG TGACGGCCTG GCCAACCAGA TCAACAACCC CGAGGTGGAG GTGGACATCA 1560 

CCAAGCCGGA CATGACCATC CGGCAGCAGA TCATGCAGCT GAAGATCATG ACCAACCGGC 1620 

TGCGCAGCGC CTACAACGGC AACGACGTGG ACTTCCAGGA CGCCAGTGAC GACGGCAGCG 1680 

GCTCGGGCAG CGGTGATGGC TGTCTGGATG ACCTCTGCGG CCGGAAGGTC AGCAGGAAGA 1740 

GCTCCAGCTC CCGGACGCCC TTGACCCATG CCCTCCCAGG CCTGTCAGAG CAGGAAGGAC 1800 

AGAAGACCTC GGCTGCCAGC TGCCCCCAGC CCCCGACCTT CCTCCTGCCC CTCCTCCTCT 1860 

TCCTGGCCCT TACAGTAGCC AGGCCCCGGT GGCGGTAACT GCCCCAAGGC CCCAGGGACA 1920 

GAGGCCAAGG ACTGACTTTG CCAAAAATAC AACACAGACG ATATTTAATT CACCTCAGCC 1980 

TGGAGAGGCC TGGGGTGGGA CAGGGAGGGC CGGCGGCTCT GAGCAGGGGC AGGCGCAGAG 2040 

GTCCCAGCCC CAGGCCTGGC CTCGCCTGCC TTTCTGCCTT TTAATTTTGT ATGAGGTCCT 2100 

CAGGTCAGCT GGGAGCCAGT GTGCCCAAAA GCCATGTATT TCAGGGACCT CAGGGGCACC 2160 

TCCGGCTGCC TAGCCCTCCC CCCAGCTCCC TGCACCGCCG CAGAAGCAGC CCCTCGAGGC 2220 

CTACAGAGGA GGCCTCAAAG CAACCCGCTG GAGCCCACAG CGAGCCTGTG CCTTCCTCCC 2280 

CGCCTCCTCC CACTGGGACT CCCAGCAGAG CCCACCAGCC AGCCCTGGCC CACCCCCCAG 2340 

CCTCCAGAGA AGCCCCGCAC GGGCTGTCTG GGTGTCCGCC ATCCAGGGTC TGGCAGAGCC 2400 

TCTGAGATGA TGCATGATGC CCTCCCCTCA GCGCAGGCTG CAGAGCCCGG CCCCACCTCC 2460 

CTGCGCCCTT GAGGGGCCCC AGCGTCTGCA GGGTGACGCC TGAGACAGCA CCACTGCTGA 2520 

GGAGTCTGAG GACTGTCCTC CCACAGACCC TGCAGTGAGG GGCCCTCCAT GCGCAGATGA 2580 

GGGGCCACTG ACCCACCTGC GCTTCTGCTG GAGGAGGGGA AGCTGGGCCC AAAGGCCCAG 2640 

GGAGGCAGCG TGGGCTCTGC CAATGTGGGC TGCCCCTCGC ACACAGGGCT CACAGGGCAG 2700 

GCCTTGCTGG GGTCCAGGGC TGTTGGAGGA CCCCGAGGGC TGAGGAGCAG CCAGGACCCG 2760 

CCTGCTCCCA TCCTCACCCA GATCAGGAAC CAGGGCCTCC CTGTTCACGG TGACACAGGT 2820 

CAGGGCTCAG AGTGACCCTC GGCTGTCACC TGCTCACAGG GATGCTGGTG GCTGGTGAGA 2880 

CCCCGCACTG CACACGGGAA TGCCTAGGTC CCTTCCCGAC CCAGCCAGCT GCACTGCAGG 2940 

GCACGGGGAC CTGGATAGTT AAGGGCTTTT CCAAACATGC ATCCATTTAC TGACACTTCC 3000 

TGTCCTTGTT CATGGAGAGC TGTTCGCTCC TCCCAGATGG CTTCGGAGGC CCGCAGGGCC 3060 

CACCTTGGAC CCTGGTGACC TCCTGTCACT CACTGAGGCC ATCAGGGCCC TGCCCCAGGC 3120 



CTGGACGGGC CCTCCTTCCC TCCTGTGCCC CAGCTGCCAG GTGGCCCTGG GGAGGGGTGG 3180

TGTGGTGTTG GGAAGGGGTC CTGCAGGGGG AGGAGGACTT GGAGGGTCTG GGGGCAGCTG 3240 

TCCTGAACCG ACTGACCCTG AGGAGGCCGC TTAGTGCTGC TTTGCTTTTC ATCACCGTCC 3300 

CGCACAGTGG ACGGAGGTCC COGGTTGCTG GTCAGGTCCC CATGGCTTGT TCTCTGGAAC 3360 

CTGACTTTAG ATGTTTTGGG ATCAGGAGCC CCCAACACAG GCAAGTCCAC CCCATAATAA 3420 

CCCTGCCAGT GCCAGGGTGG GCTGGGGACT CTGGCACAGT GATGCCGGGC GCCAGGACAG 3480 

CAGCACTCCC GCTGCACACA GACGGCCTAG GGGTGGCGCT CAGACCCCAC CCTACGCTCA 3540 

TCTCTGGAAG GGGCAGCCCT GAGTGGTCAC TGGTCAGGGC AGTGGCCAAG CCTGCTGTGT 3600 

CCTTCCTCCA CAAGGTCCCC CCACCGCTCA GTGTCAGCGG GTGACGTGTG TTCTTTTGAG 3660 
TCCTTGTATG AATAAAAGGC TGGAAACCTA AA 






Seq ID NO: 495 Protein sequence
Protein Accession #: NP_002072.1

1 11 21 31 41 51

I I I I I I

MELRARGWWL LCAAAALVAC ARGDPASKSR SCGEVRQIYG AKGFSLSDVP QAEISGEHLR 60
ICPQGYTCCT SEMEENLANR SHAELETALR DSSRVLQAML ATQLRSFDDH FQHLLNDSER 120
TLQATFPGAF GELYTQNARA FRDLYSELRL YYRGANLHLE ETLAEFWARL LERLFKQLHP 180
QLLLPDDYLD CLGKQAEALR PFGEAPRELR LRATRAFVAA RSFVQGLGVA SDVVRKVAQV 240
PLGPECSRAV MKLVYCAHCL GVPGARPCPD YCRNVLKGCL ANQADLDAEW RNLLDSMVLI 300
TDKFWGTSGV ESVIGSVHTW LAEAINALQD NRDTLTAKVI QGCGNPKVNP QGPGPEEKRR 360
RGKLAPRERP PSGTLEKLVS EAKAQLRDVQ DFWISLPGTL CSEKMALSTA SDDRCWNGMA 420
RGRYLPEVMG DGLANQINNP EVEVDITKPD MTIRQQIMQL KIMTNRLRSA YNGNDVDFQD 480
ASDDGSGSGS



Seq ID NO: 496 DNA sequence
Nucleic Acid Accession #: NM_001650.2
Coding sequence: 40..1011

1 11 21 31 41 51

I I I I I I

GGGGCAGGCA ATGAGAGCTG CACTCTGGCT GGGGAAGGCA TGAGTGACAG ACCCACAGCA 60
AGGCGGTGGG GTAAGTGTGG ACCTTTGTGT ACCAGAGAGA ACATCATGGT GGCTTTCAAA 120
GGGGTCTGGA CTCAAGCTTT CTGGAAAGCA GTCACAGCGG AATTTCTGGC CATGCTTATT 180
TTTGTTCTCC TCAGCCTGGG ATCCACCATC AACTGGGGTG GAACAGAAAA GCCTTTACCG 240
GTCGACATGG TTCTCATCTC CCTTTGCTTT GGACTCAGCA TTGCAACCAT GGTGCAGTGC 300
TTTGGCCATA TCAGCGGTGG CCACATCAAC CCTGCAGTGA CTGTGGCCAT GGTGTGCACC 360
AGGAAGATCA GCATCGCCAA GTCTGTCTTC TACATCGCAG CCCAGTGCCT GGGGGCCATC 420
ATTGGAGCAG GAATCCTCTA TCTGGTCACA CCTCCCAGTG TGGTGGGAGG CCTGGGAGTC 480
ACCATGGTTC ATGGAAATCT TACCGCTGGT CATGGTCTCC TGGTTGAGTT GATAATCACA 540
TTTCAATTGG TGTTTACTAT CTTTGCCAGC TGTGATTCCA AACGGACTGA TGTCACTGGC 600
TCAATAGCTT TAGCAATTGG ATTTTCTGTT GCAATTGGAC ATTTATTTGC AATCAATTAT 660
ACTGGTGCCA GCATGAATCC CGCCCGATCC TTTGGACCTG CAGTTATCAT GGGAAATTGG 720
GAAAACCATT GGATATATTG GGTTGGGCCC ATCATAGGAG [illegible] TGGTGGCCTT 780
TATGAGTATG TCTTCTGTCC AGATGTTGAA TTCAAACGTC GTTTTAAAGA AGCCTTCAGC 840
AAAGCTGCCC AGCAAACAAA AGGAAGCTAC ATGGAGGTGG AGGACAACAG GAGTCAGGTA 900
GAGACGGATG ACCTGATTCT AAAACCTGGA GTGGTGCATG TGATTGACGT TGACCGGGGA 960
GAGGAGAAGA AGGGGAAAGA CCAATCTGGA GAGGTATTGT CTTCAGTATG ACTAGAAGAT 1020
CGCACTGAAA GCAGACAAGA CTCCTTAGAA CTGTCCTCAG ATTTCCTTCC ACCCATTAAG 1080
GAAACAGATT TGTTATAAAT TAGAAATGTG CAGGTTTGTT GTTTCATGTC ATATTACTCA 1140
GTCTAAACAA TAAATATTTC ATAATTTACA AAGGAGGAAC GGAAGAAACC TATTGTGAAT 1200
TCCAAATCTA AAAAAAGAAA TATTTTTAAG ATGTTCTTAA GCAAATATAT ACCTATTTTA 1260
TCTAGTTACC TTTCATTAAC AACCAATTTT AACCGTGTGT CAAGATTTGG TTAAGTCTTG 1320
CCTGACAGAA CTCAAAGACA CGTCTATCAG CTTATTCCTT CTCTACTGGA ATATTGGTAT 1380
AGTCAATTCT TATTTGAATA TTTATTCTAT TAAACTGAGT TTAACAATGG C



Seq ID NO: 497 Protein sequence 
Protein Accession #: NP_001641.1 

1 11 21 31 41 51 

I I I I I I 

MSDRPTARRW GKCGPLCTRE NIMVAFKGVW TQAFWKAVTA EFLAMLIFVL LSLGSTINWG 60 
GTEKPLPVDM VLISLCFGLS IATMVQCFGH ISGGHINPAV TVAMVCTRKI SIAKSVFYIA 120 
AQCLGAIIGA GILYLVTPPS VVGGLGVTMV HGNLTAGHGL LVELIITFQL VFTIFASCDS 180 
KRTDVTGSIA LAIGFSVAIG HLFAINYTGA SMNPARSFGP AVIMGNWENH WIYWVGPIIG 240 
AVLAGGLYEY VFCPDVEFKR RFKEAFSKAA QQTKGSYMEV EDNRSQVETD DLILKPGVVH 300 
VIDVDRGEEK KGKDQSGEVL SSV 



Seq ID NO: 498 DNA sequence 
Nucleic Acid Accession #: AB020684. 
Coding sequence: 1..1744 

1 11 21 31 41 51 

I I I I I I 

CCCCCTTGTC ATTAATACAT TAAAAAGATT CAATCTTTAC CCTGAGGTAA TTTTGGCCAG 60 

TTGGTACCGG ATTTATACCA AAATAATGGA CTTGATTGGT ATTCAAACCA AGATATGTTG 120 

GACGGTTACC AGAGGAGAAG GACTCAGTCC TATTGAAAGC TGTGAAGGAT TGGGAGATCC 180 

TGCTTGCTTT TATGTTGCTG TAATTTTTAT TTTAAATGGA CTAATGATGG CATTATTCTT 240 

CATATATGGC ACATATTTAA GTGGCAGCCG ATTAGGAGGC CTGGTTACAG TGTTGTGCTT 300 

CTTTTTCAAT CATGGAGAGT GTACCCGTGT AATGTGGACA CCACCTCTCC GTGAAAGCTT 360 

CTCATATCCA TTTCTTGTTC TTCAGATGTT GCTAGTGACT CATATTCTCA GGGCTACAAA 420 

ACTTTATAGA GGAAGCTTGA TTGCACTCTG CATTTCCAAT GTATTTTTCA TGCTTCCTTG 480 

GCAGTTTGCT CAGTTTGTAC TTCTTACTCA GATTGCATCA TTATTTGCAG TATATGTTGT 540 

CGGGTACATT GATATATGTA AATTACGGAA GATCATTTAT ATACACATGA TTTCTCTTGC 600 

ACTTTGTTTT GTTTTGATGT TTGGGAACTC AATGTTATTA ACTTCTTATT ATGCTTCTTC 660 

TTTGGTAATT ATTTGGGGTA TTCTGGCAAT GAAACCACAT TTCCTGAAAA TAAATGTATC 720 

TGAACTTAGT TTATGGGTTA TTCAAGGATG TTTTTGGTTA TTTGGAACTG TCATACTTAA 780 

ATACTTGACA TCTAAAATTT TTGGTATTGC AGATGACGCT CATATTGGCA ACTTACTAAC 840 

ATCAAAATTC TTTAGTTATA AGGATTTTGA TACTTTATTG TATACCTGTG CAGCQQAGTT 900 

TGACTTTATG GAAAAAGAGA CTCCACTGAG ATACACAAAG ACATTATTGC TTCCAGTTGT 960 

TCTTGTAGTG TTTGTTGCTA TTGTTAGAAA GATTATTAGT GATATGTGGG GTGTCTTAGC 1020 

TAAACAACAG ACACATGTAA GAAAACACCA GTTTGATCAT GGAGAGCTGG TTTACCATGC 1080 

ATTGCAATTG TTAGCATATA CAGCCCTTGG TATTTTAATT ATGAGACTAA AACTCTTCTT 1140 

GACACCACAC ATGTGTGTTA TGGCATCACT GATCTGCTCA AGACAGCTAT TTGGATGGCT 1200 

CTTTTGCAAA GTACATCCTG GTGCTATTGT GTTTGCTATA TTAGCAGCAA TGTCAATACA 1260 

AGGTTCAGCA AATCTGCAAA CCCAGTGGAA TATTGTAGGG GAGTTCAGCA ATTTGCCCCA 1320 

AGAAGAACTT ATAGAATGGA TCAAATATAG TACTAAACCA GATGCAGTGT TTGCGGGTGC 1380 

GATGCCCACG ATGGCAAGTG TTAAGCTCTC TGCACTTCGG CCCATTGTGA ATCATCCACA 1440 

TTATGAAGAC GCAGGCTTGA GAGCCAGAAC AAAAATAGTA TACTCAATGT ATAGTCGGAA 1500 

AGCAGCCGAA GAAGTGAAGC GAGAACTGAT AAAGTTAAAA GTGAACTATT ACATTCTAGA 1560 

AGAGTCATGG TGTGTAAGAA GATCCAAGCC TGGTTGCAGT ATGCCTGAAA TTTGGGATGT 1620 

AGAAGATCCT GCCAATGCTG GGAAAACTCC CTTATGTAAC CTCTTGGTGA AGGATTCCAA 1680 

ACCTCACTTC ACCACTGTAT TCCAGAACAG TGTTTACAAA GTCCTAGAAG TTGTAAAAGA 1740 

ATGACTGCTA CATGACCTGC TGCCTAOGGA GAACTACATC TGTAATGGTT TTAATGTTTT 1800 

GCTAAGTCAT GTGTTGTTCA TATCCCAAAA ACTTTTATAG GTAACTGTTT TCAAATAGAA 1860 

AACGTTTTAT TTGGTCAATT TGAATGTCAT TCTAATTATA AAAATGACTT ACACCTTTAT 1920 

CAATTGGTTA CTATTTCAAT GCACCCTTTA AAATTTGCTA TGCAAATGAG TATATGCTTG 1980 

TACTTGACTT TAATATTTGT GCTAAAGTGA GCAAAGCTAC CTGTATAAAG AAAACACAGT 2040 

GGGTTGTGAC AAGGATGACA TGAAAATACA GGACAATTCT GACAATGTAG GGGCTGATTT 2100 

TATAGTGTAA GAACTATTAA TGCCCCTTGC TTCTTTTTTC TGCCTCTTGC TCTTGTCTTT 2160 

TGGACATTTC AGTGATTGTA AGTTCTTCGG TCATGTCAGC CCCTGTCATC AACTTGAGTT 2220 

ACAGTAGATG GGGCAGACAT GGAGTGTTTG CTATATAAAA CTATCTGTTT GTTTTACTTC 2280 

CTTGTGCGCT TTTTGTTCTC TGTTCTCTTG TTAATGAAGC TTTTCCTGCC CATTATTAAT 2340 

CCAAACTCTT GGACCTTGTG GTTAGGAAAT TCCCTTAACT TCCAGCCATA TGGCATTATC 2400 

GTGTCTCTTT CTCTCTCTCT CTTGCTCTCT CTCTTCTCCT CTTCCCCATA TTTTCTGTCA 2460 

AATAAGTACT GTTTACTCAT TTAGTTGCTT ATCAAGTACT TATTCTTGGT TTTAAAAAAA 2520 

ATTAATGGTA ACTGTATTTT TCTCATTTTT AGCATTATTC AAATGTTTAT ATTTTAATAC 2580 

CTTTAAACCA CTTTAAAGTT TTTTCATGTT TAATTATAGT TTTAAGAAAA ACTATTTTGA 2640 

ACAACCCCAA ATATAGTGCA TCTAGAAACT AATGTATATT TGATTAGACA TCATTTATAG 2700 

TGGAACAGTA GACTGTAGTA CATGGTAATT TTTCTTTTAC TATTAAGATA CAATAAAACA 2760 

TGACTAATTT TGCTGTCAAA AATGTAAAGA ATAATGATAA ATGGAGTTTT TTATATTTTA 2820 

CTTTTAAGAT TGCCTGTCTT TAATAAGACA AAGCCTTAAG CCTTATGTTA TAATTTTGGT 2880 

TCTAAAAACC ATCATTTCAG TATAAGGAAT AAGTATATTT CGTCCTCCTC TTTAGTTTTT 2940 

TTCTTCCTAT TTATTTTTAT TTTGAAAAAT TTCTACACCT TCTTTGAATT CCTTGTATGA 3000 

ATTTTTGTTT CTTAGAAGTT AATTTGTGTG AAATGAGATT CTTCAAAACG ATGAAACCTC 3060 

ATAGCTCTGA GAAAAGGTTT TAGGGTTTTA AATTCTAAGC AAAGCGTGAC TATGGCTGAC 3120 

AGACTACACA TTTAATTATA CAGCTTCTCT TTCTTAACCA CAGGCAGATT AACCTCATTG 3180 

TGGATTGTCC TTCAGACCTT AGTCCTCAGG CATGGTTTCT GGTGCCCACT CCTGGAAGCC 3240 

GCTGTTCCCT TTCTACCTTC TTACCAGAGC CCAAGGGCAG GCCTGGTCCC GGGGAAGCAG 3300 

CAGCTTGCTG ACATAAGTCA GCTGCAAAGG CTGAGGAGTG TGCCCTCAGA GAAGCACCGC 3360 

CCCCCAGTCT TGTGCCAGCG CCTAGAGCCG CAGCTCCCAG GGATGCTCCT TCCCTGGAGG 3420 

CAGCCCAGGA GAGGGACTCT GGCAGCGTTC TTCAGATTTG TGGCCACTGT TTCTCATTTG 3480 

CTGGTTGACT GTTTTTATTT CTTAGGCTTT TGCTAGTTTT AGAAAATAGG GAAGCAGCCC 3540 

TTGATTTGTG GATTAAAAGC AACATTTGAG CGATGATGCA CAACAGTCCA GGAAAATGGG 3600 

CGGTGGACAC TTGAGGCTGA GGATGGGAGT TGACATGAGC AGGGAGAGGG AGGTGCGCGC 3660 

TGCTTATCTG TGATTGTTGC TCACCTGAGT GTGGCTGATT GTGTACATCC AGCAGTTACA 3720 

ATTTTTAAAA ATTATACTTT TACATTTATT TTATATTTTT CTCACCCCCA GTAATTTCCT 3780 

TCCAAAGAAG TTCACATGTA ATAAGTAGAA ATTCTGTATA GGAAAAAAGC ATTAAAAATA 3840 

CTATTATAAC TGCTTCATTT GCTGGGAACC ATTAAAAGTA ATATAAATTA GCTTTTTCCA 3900 

GAAGGATCCT TTTGTAGCAG TGTTTATGAA TGTAACCCCC AGCAAAATAT GGCTATATAT 3960 

TAGGGGAGCC AGTTTGGAGC AGAGGCCTGA AGGTCCCTGC TATGCAGCCG TGGCCACAGC 4020 

TCGCAGCCCA AGCACTGTGG AGCATCCACA CCTTTGATGG CAATGCAGAT TGGTAGCAGG 4080 

TTCCATAGGC GTACAAAACA GTATTAAAGC TCAGTGTTTT GCATATTGTT AGCATTTACA 4140 

AATATTTTTG CTTTAGTATG AGGAAAGTAA GGATGGGCAA AGAAGCGATC AAAATAGCTA 4200 

TTGCTACAAC ATTTTCGAAA ACAAAGTTGG GGCTGTATTT CTTTAAAAAG ATAAGCCTCT 4260 

AAAAATGCTT GGCAAAAAAA ATATAGTGTT AAAATAGGCC AGTGATATTA ATGAGAAAAT 4320 

GAAAGTATGT ATCAGGAATA AAGTGATATT GCATAGGAGT ATTGTATTTT TATGAATTTT 4380 
ATGCCAGTTG TTTACATGTA CTATATATGT TAAATTAAAA AAAATCATGA GAAATG 

Seq ID NO: 499 Protein sequence 
Protein Accession #: BAA74900.1 

1 11 21 31 41 51 

I I I I I I 

PLVINTLKRF NLYPEVILAS WYRIYTKIMD LIGIQTKICW TVTRGEGLSP IESCEGLGDP 60 

ACFYVAVIFI LNGLMMALFF IYGTYLSGSR LGGLVTVLCF FFNHGECTRV MWTPPLRESF 120 

SYPFLVLQML LVTHILRATK LYRGSLIALC ISNVFFMLPW QFAQFVLLTQ IASLFAVYVV 180 

GYIDICKLRK IIYIHMISLA LCFVLMFGNS MLLTSYYASS LVIIWGILAM KPHFLKINVS 240 

ELSLWVIQGC FWLFGTVILK YLTSKIFGIA DDAHIGNLLT SKFFSYKDFD TLLYTCAAEF 300 

DFMEKETPLR YTKTLLLPVV LVVFVAIVRK IISDMWGVLA KQQTHVRKHQ FDHGELVYHA 360 

LQLLAYTALG ILIMRLKLFL TPHMCVMASL ICSRQLFGWL FCKVHPGAIV FAILAAMSIQ 420 

GSANLQTQWN IVGEFSNLPQ EELIEWIKYS TKPDAVFAGA MPTMASVKLS ALRPIVNHPH 480 

YEDAGLRART KIVYSMYSRK AAEEVKRELI KLKVNYYILE ESWCVRRSKP GCSMPEIWDV 540 
EDPANAGKTP LCNLLVKDSK PHFTTVFQNS VYKVLEVVKE 

Seq ID NO: 500 DNA sequence 

Nucleic Acid Accession #: NM_001276.1 

Coding sequence: 127..1278 

1 11 21 31 41 51 

I I I I I I 

AGTGGAGTGG GACAGGTATA TAAAGGAAGT ACAGGGCCTG GGGAAGAGGC CCTGTCTAGG 60 

TAGCTGGCAC CAGGAGCCGT GGGCAAGGGA AGAGGCCACA CCCTGCCCTG CTCTGCTGCA 120 



GCCAGAATGG GTGTGAAGGC GTCTCAAACA GGCTTTGTGG TCCTGGTGCT GCTCCAGTGC 180 

TGCTCTGCAT ACAAACTGGT CTGCTACTAC ACCAGCTGGT CCCAGTACCG GGAAGGOGAT 240 

GGGAGCTGCT TCCCAGATGC CCTTGACCGC TTCCTCTGTA CCCACATCAT CTACAGCTTT 300 

GCCAATATAA GCAACGATCA CATCGACACC TGGGAGTGGA ATGATGTGAC GCTCTACGGC 360 

ATGCTCAACA CACTCAAGAA CAGGAACCCC AACCTGAAGA CTCTCTTGTC TGTCGGAGGA 420 

TGGAACTTTG GGTCTCAAAG ATTTTCCAAG ATAGCCTCCA ACACCCAGAG TCGCCGGACT 480 

TTCATCAAGT CAGTACCGCC ATTCCTGCGC ACCCATGGCT TTGATGGGCT GGACCTTGCC 540 

TGGCTCTACC CTGGACGGAG AGACAAACAG CATTTTACCA CCCTAATCAA GGAAATGAAG 600 

GCCGAATTTA TAAAGGAAGC CCAGCCAGGG AAAAAGCAGC TCCTGCTCAG CGCAGCACTG 660 

TCTGCGGGGA AGGTCACCAT TGACAGCAGC TATGACATTG CCAAGATATC CCAACACCTG 720 

GATTTCATTA GCATCATGAC CTACGATTTT CATGGAGCCT GGCGTGGGAC CACAGGCCAT 780 

CACAGTCCCC TGTTCCGAGG TCAGGAGGAT GCAAGTCCTG ACAGATTCAG CAACACTGAC 840 

TATGCTGTGG GGTACATGTT GAGGCTGGGG GCTCCTGCCA GTAAGCTGGT GATGGGCATC 900 

CCCACCTTCG GGAGGAGCTT CACTCTGGCT TCTTCTGAGA CTGGTGTTGG AGCCCCAATC 960 

TCAGGACCGG GAATTCCAGG CCGGTTCACC AAGGAGGCAG GGACCCTTGC CTACTATGAG 1020 

ATCTGTGACT TCCTCCGCGG AGCCACAGTC CATAGAACCC TCGGCCAGCA GGTCCCCTAT 1080 

GCCACCAAGG GCAACCAGTG GGTAGGATAC GACGACCAGG AAAGCGTCAA AAGCAAGGTG 1140 

CAGTACCTGA AGGATAGGCA GCTGGCAGGC GCCATGGTAT GGGCCCTGGA CCTGGATGAC 1200 

TTCCAGGGCT CCTTCTGCGG CCAGGATCTG CGCTTCCCTC TCACCAATGC CATCAAGGAT 1260 

GCACTCGCTG CAACGTAGCC CTCTGTTCTG CACACAGCAC GGGGGCCAAG GATGCCCCGT 1320 

CCCCCTCTGG CTCCAGCTGG CCGGGAGCCT GATCACCTGC CCTGCTGAGT CCCAGGCTGA 1380 

GCCTCAGTCT CCCTCCCTTG GGGCCTATGC AGAGGTCCAC AACACACAGA TTTGAGCTCA 1440 

GCCCTGGTGG GCAGAGAGGT AGGGATGGGG CTGTGGGGAT AGTGAGGCAT CGCAATGTAA 1500 

GACTCGGGAT TAGTACACAC TTGTTGATGA TTAATGGAAA TGTTTACAGA TCCCCAAGCC 1560 

TGGCAAGGGA ATTTCTTCAA CTCCCTGCCC CCTAGCCCTC CTTATCAAAG GACACCATTT 1620 

TGGCAAGCTC TATCACCAAG GAGCCAAACA TCCTACAAGA CACAGTGACC ATACTAATTA 1680 

TACCCCCTGC AAAGCCAGCT TGAAACCTTC ACTTAGGAAC GTAATCGTGT CCCCTATCCT 1740 

ACTTCCCCTT CCTAATTCCA CAGCTGCTCA ATAAAGTACA AGAGTTTAAC AGTGTGTTGG 1800 

CGCTTTGCTT TGGTCTATCT TTGAGCGCCC ACTAGACCCA CTGGACTCAC CTCCCCCATC 1860 

TCTTCTGGGT TCCTTCCTCT GAGCCTTGGG ACCCCTGAGC TTGCAGAGAT GAAGGCCGCC 1920 
ATGTT 






Seq ID NO: 501 Protein sequence 
Protein Accession #: NP_001267.1 

1 11 21 31 41 51 

I I I I I I 

MGVKASQTGF VVLVLLQCCS AYKLVCYYTS WSQYREGDGS CFPDALDRFL CTHIIYSFAN 60 
ISNDHIDTWE WNDVTLYGML NTLKNRNPNL KTLLSVGGWN FGSQRFSKIA SNTQSRRTFI 120 
KSVPPFLRTH GFDGLDLAWL YPGRRDKQHF TTLIKEMKAE FIKEAQPGKK QLLLSAALSA 180 
GKVTIDSSYD IAKISQHLDF ISIMTYDFHG AWRGTTGHHS PLFRGQEDAS PDRFSNTDYA 240 
VGYMLRLGAP ASKLVMGIPT FGRSFTLASS ETGVGAPISG PGIPGRFTKE AGTLAYYEIC 300 
DFLRGATVHR TLGQQVPYAT KGNQWVGYDD QESVKSKVQY LKDRQLAGAM VWALDLDDFQ 360 
GSFCGQDLRF PLTNAIKDAL AAT 



Seq ID NO: 502 DNA sequence 

Nucleic Acid Accession #: NM_006474. 

Coding sequence: 181..669 

1 11 21 31 41 51 

I I I I I I 

GCTGCCTAGG GTCTGGAAAG CTCGGGCACC CTCCCTCTCC GGGGCTCCTG CTCCCACCCC 60 
TCCGGCCCCC CCACCGTCGC GCTCCTCCAG GCTGGGCCTG TGGCCGCGGT GCTTTTAATT 120 
TTCCCCCAGC TCAGAATCTT GCTGCTCGGC CCCCAGGAGA GCAACAACTC AACGGGAACG 180 
ATGTGGAAGG TGTCAGCTCT GCTCTTCGTT TTGGGAAGCG CGTCGCTCTG GGTCCTGGCA 240 
GAAGGAGCCA GCACAGGCCA GCCAGAAGAT GACACTGAGA CTACAGGTTT GGAAGGCGGC 300 
GTTGCCATGC CAGGTGCCGA AGATGATGTG GTGACTCCAG GAACCAGCGA AGACCGCTAT 360 
AAGTCTGGCT TGACAACTCT GGTGGCAACA AGTGTCAACA GTGTAACAGG CATTCGCATC 420 
GAGGATCTGC CAACTTCAGA AAGCACAGTC CACGCGCAAG AACAAAGTCC AAGCGCCACA 480 
GCCTCAAACG TGGCCACCAG TCACTCCACG GAGAAAGTGG ATGGAGACAC ACAGACAACA 540 
GTTGAGAAAG ATGGTTTGTC AACAGTGACC CTGGTTGGAA TCATAGTTGG GGTCTTACTA 600 
GCCATCGGTT TCATTGGTGG AATCATCGTT GTGGTTATGC GAAAAATGTC GGGAAGGTAC 660 
TCGCCCTAAA GAGCTGAAGG GTTACGCCCT GCTTGCCAAC GTGCTTTAAA AAAAGACCGT 720 
TTCTGACTCT GTGGCCCTGT CCCTGAGCTC GTGGGGAGAA GATGACCCTG GGAACATTTG 780 
CGGGCCCATT CAGATTCCAC GGTGACTTTC CGTTTGCCAA ATTAACCGAG GAAAGACCTT 840 
TCACCAGATT TGGTTCTTAA ACTTT 
85 



Seq ID NO: 503 Protein sequence 
Protein Accession #: NP_006465.1 

1 11 21 31 41 51 

I I I I I I 

MWKVSALLFV LGSASLWVLA EGASTGQPED DTETTGLEGG VAMPGAEDDV VTPGTSEDRY 60 
KSGLTTLVAT SVNSVTGIRI EDLPTSESTV HAQEQSPSAT ASNVATSHST EKVDGDTQTT 120 
VEKDGLSTVT LVGIIVGVLL AIGFIGGIIV VVMRKMSGRY SP 

Seq ID NO: 504 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 62..895 

1 11 21 31 41 51 

I I I I I I 

CACTGCTCTG AGAATTTGTG AGCAGCCCCT AACAGGCTGT TACTTCACTA CAACTGACGA 60 
TATGATCATC TTAATTTACT TATTTCTCTT GCTATGGGAA GACACTCAAG GATGGGGATT 120 
CAAGGATGGA ATTTTTCATA ACTCCATATG GCTTGAACGA GCAGCCGGTG TGTACCACAG 180 
AGAAGCACGG TCTGGCAAAT ACAAGCTCAC CTACGCAGAA GCTAAGGCGG TGTGTGAATT 240 
TGAAGGCGGC CATCTCGCAA CTTACAAGCA GCTAGAGGCA GCCAGAAAAA TTGGATTTCA 300 


WO 02/086443 

TGTCTGTGCT GCTGGATGGA TGGCTAAGGG CAGAGTTGGA TACCCCATTG TGAAGCCAGG 360 
GCCCAACTGT GGATTTGGAA AAACTGGCAT TATTGATTAT GGAATCCGTC TCAATAGGAG 420 
TGAAAGATGG GATGCCTATT GCTACAACCC ACACGCAAAG GAGTGTGGTG GCGTCTTTAC 480 
AGATCCAAAG CAAATTTTTA AATCTCCAGG CTTCCCAAAT GAGTACGAAG ATAACCAAAT 540 
CTGCTACTGG CACATTAGAC TCAAGTATGG TCAGCGTATT CACCTGAGTT TTTTAGATTT 600 
TGACCTTGAA GATGACCCAG GTTGCTTGGC TGATTATGTT GAAATATATG ACAGTTACGA 660 
TGATGTCCAT GGCTTTGTGG GAAGATACTG TGGAGATGAG CTTCCAGATG ACATCATCAG 720 
TACAGGAAAT GTCATGACCT TGAAGTTTCT AAGTGATGCT TCAGTGACAG CTGGAGGTTT 780 
CCAAATCAAA TATGTTGCAA TGGATCCTGT ATCCAAATCC AGTCAAGGAA AAAATACAAG 840 
TACTACTTCT ACTGGAAATA AAAACTTTTT AGCTGGAAGA TTTAGCCACT TATAAAAAAA 900 
AAAAAAAGGA TGATCAAAAC ACACAGTGTT TATGTTGGAA TCTTTTGGAA CTCCTTTGAT 960 
CTCACTGTTA TTATTAACAT TTATTTATTA TTTTTCTAAA TGTGAAAGCA ATACATAATT 1020 
TAGGGAAAAT TGGAAAATAT AGGAAACTTT AAACGAGAAA ATGAAACCTC TCATAATCCC 1080 
ACTGCATAGA AATAACAAGC GTTAACATTT TCATATTTTT TTCTTTCAGT CATTTTTCTA 1140 
TTTGTGGTAT ATGTATATAT GTACCTATAT GTATTTGCAT TTGAAATTTT GGAATCCTGC 1200 
TCTATGTACA GTTTTGTATT ATACTTTTTA AATCTTGAAC TTTATAAACA TTTTCTGAAA 1260 
TCATTGATTA TTCTACAAAA ACATGATTTT AAACAGCTGT AAAATATTCT ATGATATGAA 1320 
TGTTTTATGC ATTATTTAAG CCTGTCTCTA TTGTTGGAAT TTCAGGTCAT TTTCATAAAT 1380 
ATTGTTGCAA TAAATATCCT TGAACACACA AAAAAAAAAA AA 

Seq ID NO: 505 Protein sequence 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

I I I I I I 

MIILIYLFLL LWEDTQGWGF KDGIFHNSIW LERAAGVYHR EARSGKYKLT YAEAKAVCEF 60 
EGGHLATYKQ LEAARKIGFH VCAAGWMAKG RVGYPIVKPG PNCGFGKTGI IDYGIRLNRS 120 
ERWDAYCYNP HAKECGGVFT DPKQIFKSPG FPNEYEDNQI CYWHIRLKYG QRIHLSFLDF 180 
DLEDDPGCLA DYVEIYDSYD DVHGFVGRYC GDELPDDIIS TGNVMTLKFL SDASVTAGGF 240 
QIKYVAMDPV SKSSQGKNTS TTSTGNKNFL AGRFSHL 



Seq ID NO: 506 DNA sequence 

Nucleic Acid Accession #: NM_007115.1 

Coding sequence: 69..902 

1 11 21 31 41 51 

I I I I I I 

GAATTCGCAC TGCTCTGAGA ATTTGTGAGC AGCCCCTAAC AGGCTGTTAC TTCACTACAA 60 
CTGACGATAT GATCATCTTA ATTTACTTAT TTCTCTTGCT ATGGGAAGAC ACTCAAGGAT 120 
GGGGATTCAA GGATGGAATT TTTCATAACT CCATATGGCT TGAACGAGCA GCCGGTGTGT 180 
ACCACAGAGA AGCACGGTCT GGCAAATACA AGCTCACCTA CGCAGAAGCT AAGGCGGTGT 240 
GTGAATTTGA AGGCGGCCAT CTCGCAACTT ACAAGCAGCT AGAGGCAGCC AGAAAAATTG 300 
GATTTCATGT CTGTGCTGCT GGATGGATGG CTAAGGGCAG AGTTGGATAC CCCATTGTGA 360 
AGCCAGGGCC CAACTGATGA TTTGGAAAAA CTGGCATTAT TGATTATGGA ATCCGTCTCA 420 
ATAGGAGTGA AAGATGGGAT GCCTATTGCT ACAACCCACA CGCAAAGGAG TGTGGTGGCG 480 
TCTTTACAGA TCCAAAGCGA ATTTTTAAAT CTCCAGGCTT CCCAAATGAG TACGAAGATA 540 
ACCAAATCTG CTACTGGCAC ATTAGACTCA AGTATGGTCA GCGTATTCAC CTGAGTTTTT 600 
TAGATTTTGA CCTTGAAGAT GACCCAGGTT GCTTGGCTGA TTATGTTGAA ATATATGACA 660 
GTTACGATGA TGTCCATGGC TTTGTGGGAA GATACTGTGG AGATGAGCTT CCAGATGACA 720 
TCATCAGTAC AGGAAATGTC ATGACCTTGA AGTTTCTAAG TGATGCTTCA GTGACAGCTG 780 
GAGGTTTCCA AATCAAATAT GTTGCAATGG ATCCTGTATC CAAATCCAGT CAAGGAAAAA 840 
ATACAAGTAC TACTTCTACT GGAAATAAAA ACTTTTTAGC TGGAAGATTT AGCCACTTAT 900 
AAAAAAAAAA AAGGATGATC AAAACACACA GTGTTTATGT TGGAATCTTT TGGAACTCCT 960 
TTGATCTCAC TGTTATTATT AACATTTATT TATTATTTTT CTAAATGTGA AAGAAATACA 1020 
TAATTTAGGG AAAATTGGAA AATATAGGAA ACTTTAAACG AGAAAATGAA ACCTCTCATA 1080 
ATCCCACTGC ATAGAAATAA CAAGCGTTAA CATTTTCATA TTTTTTTCTT TCAGTCATTT 1140 
TTGTATTTGT GGTATATGTA TATATGTACC TATATGTATT TGCATTTGAA ATTTTGGAAT 1200 
CCTGCTCTAT GTACAGTTTT GTATTATACT TTTTAAATCT TGAACTTTAT GAACATTTTC 1260 
TGAAATCATT GATTATTCTA CAAAAACATG ATTTTAAACA GCTGTAAAAT ATTCTATGAT 1320 
ATGAATGTTT TATGCATTAT TTAAGCCTGT CTCTATTGTT GGAATTTCAG GTCATTTTCA 1380 
TAAATATTGT TGCAATAAAT ATCCTTCGGA ATTC 



Seq ID NO: 507 Protein sequence 
Protein Accession #: NP_009046.1 

1 11 21 31 41 51 

I I I I I I 

MIILIYLFLL LWEDTQGWGF KDGIFHNSIW LERAAGVYHR EARSGKYKLT YAEAKAVCEF 60 
EGGHLATYKQ LEAARKIGFH VCAAGWMAKG RVGYPIVKPG PNXXFGKTGI IDYGIRLNRS 120 
ERWDAYCYNP HAKECGGVFT DPKRIFKSPG FPNEYEDNQI CYWHIRLKYG QRIHLSFLDF 180 
DLEDDPGCLA DYVEIYDSYD DVHGFVGRYC GDELPDDIIS TGNVMTLKFL SDASVTAGGF 240 
QIKYVAMDPV SKSSQGKNTS TTSTGNKNFL AGRFSHL 



Seq ID NO: 508 DNA sequence 

Nucleic Acid Accession #: NM_001044. 

Coding sequence: 129..1991 

1 11 21 31 41 51 

I I I I I I 

ACCGCTCCGG AGCGGGAGGG GAGGCTTCGC GGAACGCTCT CGGCGCCAGG ACTCGCGTGC 60 

AAAGCCCAGG CCCGGGCGGC CAGACCAAGA GGGAAGAAGC ACAGAATTCC TCAACTCCCA 120 

GTGTGCCCAT GAGTAAGAGC AAATGCTCCG TGGGACTCAT GTCTTCCGTG GTGGCCCCGG 180 

CTAAGGAGCC CAATGCCGTG GGCCCGAAGG AGGTGGAGCT CATCCTTGTC AAGGAGCAGA 240 

ACGGAGTGCA GCTCACCAGC TCCACCCTCA CCAACCCGCG GCAGAGCCCC GTGGAGGCCC 300 

AGGATCGGGA GACCTGGGGC AAGAAGATCG ACTTTCTCCT GTCCGTCATT GGCTTTGCTG 360 

TGGACCTGGC CAACGTCTGG CGGTTCCCCT ACCTGTGCTA CAAAAATGGT GGCGGTGCCT 420 

TCCTGGTCCC CTACCTGCTC TTCATGGTCA TTGCTGGGAT GCCACTTTTC TACATGGAGC 480 

TGGCCCTCGG CCAGTTCAAC AGGGAAGGGG CCGCTGGTGT CTGGAAGATC TGCCCCATAC 540 

TGAAAGGTGT GGGCTTCACG GTCATCCTCA TCTCACTGTA TGTCGGCTTC TTCTACAACG 600 

TCATCATCGC CTGGGCGCTG CACTATCTCT TCTCCTCCTT CACCACGGAG CTCCCCTGGA 660 

TCCACTGCAA CAACTCCTGG AACAGCCCCA ACTGCTCGGA TGCCCATCCT GGTGACTCCA 720 

GTGGAGACAG CTCGGGCCTC AACGACACTT TTGGGACCAC ACCTGCTGCC GAGTACTTTG 780 

AACGTGGCGT GCTGCACCTC CACCAGAGCC ATGGCATCGA CGACCTGGGG CCTCCGCGGT 840 

GGCAGCTCAC AGCCTGCCTG GTGCTGGTCA TCGTGCTGCT CTACTTCAGC CTCTGGAAGG 900 

GCGTGAAGAC CTCAGGGAAG GTGGTATGGA TCACAGCCAC CATGCCATAC GTGGTCCTCA 960 

CTGCCCTGCT CCTGCGTGGG GTCACCCTCC CTGGAGCCAT AGACGGCATC AGAGCATACC 1020 

TGAGCGTTGA CTTCTACCGG CTCTGCGAGG CGTCTGTTTG GATTGACGCG GCCACCCAGG 1080 

TGTGCTTCTC CCTGGGCGTG GGGTTCGGGG TGCTGATCGC CTTCTCCAGC TACAACAAGT 1140 

TCACCAACAA CTGCTACAGG GACGCGATTG TCACCACCTC CATCAACTCC CTGACGAGCT 1200 

TCTCCTCCGG CTTCGTCGTC TTCTCCTTCC TGGGGTACAT GGCACAGAAG CACAGTGTGC 1260 

CCATCGGGGA CGTGGCCAAG GACGGGCCAG GGCTGATCTT CATCATCTAC CCGGAAGCCA 1320 

TCGCCACGCT CCCTCTGTCC TCAGCCTGGG CCGTGGTCTT CTTCATCATG CTGCTCACCC 1380 

TGGGTATCGA CAGCGCCATG GGTGGTATGG AGTCAGTGAT CACCGGGCTC ATCGATGAGT 1440 

TCCAGCTGCT GCACAGACAC CGTGAGCTCT TCACGCTCTT CATCGTCCTG GCGACCTTCC 1500 

TCCTGTCCCT GTTCTGCGTC ACCAACGGTG GCATCTACGT CTTCACGCTC CTGGACCATT 1560 

TTGCAGCCGG CACGTCCATC CTCTTTGGAG TGCTCATCGA AGCCATCGGA GTGGCCTGGT 1620 

TCTATGGTGT TGGGCAGTTC AGCGACGACA TCCAGCAGAT GACCGGGCAG CGGCCCAGCC 1680 

TGTACTGGCG GCTGTGCTGG AAGCTGGTCA GCCCCTGCTT TCTCCTGTTC GTGGTOGTGG 1740 

TCAGCATTGT GACCTTCAGA CCCCCCCACT ACGGAGCCTA CATCTTCCCC GACTGGGCCA 1800 

ACGCGCTGGG CTGGGTCATC GCCACATCCT CCATGGCCAT GGTGCCCATC TATGCGGCCT 1860 

ACAAGTTCTG CAGCCTGCCT GGGTCCTTTC GAGAGAAACT GGCCTACGCC ATTGCACCCG 1920 

AGAAGGACCG TGAGCTGGTG GACAGAGGGG AGGTGCGCCA GTTCACGCTC CGCCACTGGC 1980 

TCAAGGTGTA GAGGGAGCAG AGACGAAGAC CCCAGGAAGT CATCCTGCAA TGGGAGAGAC 2040 

ACGAACAAAC CAAGGAAATC TAAGTTTCGA GAGAAAGGAG GGCAACTTCT ACTCTTCAAC 2100 

CTCTACTGAA AACACAAACA ACAAAGCAGA AGACTCCTCT CTTCTGACTG TTTACACCTT 2160 

TCCGTGCCGG GAGCGCACCT CGCCGTGTCT TGTGTTGCTG TAATAACGAC GTAGATCTGT 2220 

GCAGCGAGGT CCACCCCGTT GTTGTCCCTG CAGGGCAGAA AAACGTCTAA CTTCATGCTG 2280 

TCTGTGTGAG GCTCCCTCCC TCCCTGCTCC CTGCTCCCGG CTCTGAGGCT GCCCCAGGGG 2340 

CACTGTGTTC TCAGGCGGGG ATCACGATCC TTGTAGACGC ACCTGCTGAG AATCCCCGTG 2400 

CTCACAGTAG CTTCCTAGAC CATTTACTTT GCCCATATTA AAAAGCCAAG TGTCCTGCTT 2460 

GGTTTAGCTG TGCAGAAGGT GAAATGGAGG AAACCACAAA TTCATGCAAA GTCCTTTCCC 2520 

GATGCGTGGC TCCCAGCAGA GGCCGTAAAT TGAGCGTTCA GTTGACACAT TGCACACACA 2580 

GTCTGTTCAG AGGCATTGGA GGATGGGGGT CCTGGTATGT CTCACCAGGA AATTCTGTTT 2640 

ATGTTCTTGC AGCAGAGAGA AATAAAACTC CTTGAAACCA GCTCAGGCTA CTGCCACTCA 2700 

GGCAGCCTGT GGGTCCTTGT GGTGTAGGGA ACGGCCTGAG AGGAGCGTGT CCTATCCCCG 2760 

GACGCATGCA GGGCCCCCAC AGGAGCGTGT CCTATCCCCG GACGCATGCA GGGCCCCCAC 2820 

AGGAGCATGT CCTATCCCTG GACGCATGCA GGGCCCCCAC AGGAGCGTGT ACTACCCCAG 2880 

AACGCATGCA GGGCCCCCAC AGGAGCGTGT ACTACCCCAG GACGCATGCA GGGCCCCCAC 2940 

TGGAGCGTGT ACTACCCCAG GACGCATGCA GGGCCCCCAC AGGAGCGTGT CCTATCCCCG 3000 

GACCGGACGC ATGCAGGGCC CCCACAGGAG CGTGTACTAC CCCAGGACGC ATGCAGGGCC 3060 

CCCACAGGAG CGTGTACTAC CCCAGGATGC ATGCAGGGCC CCCACAGGAG CGTGTACTAC 3120 

CCCAGGACGC ATGCAGGGCC CCCATGCAGG CAGCCTGCAG ACCAACACTC TGCCTGGCCT 3180 

TGAGCCGTGA CCTCCAGGAA GGGACCCCAC TGGAATTTTA TTTCTCTCAG GTGCGTGCCA 3240 

CATCAATAAC AACAGTTTTT ATGTTTGCGA ATGGCTTTTT AAAATCATAT TTACCTGTGA 3300 

ATCAAAACAA ATTCAAGAAT GCAGTATCCG CGAGCCTGCT TGCTGATATT GCAGTTTTTG 3360 

TTTACAAGAA TAATTAGCAA TACTGAGTGA AGGATGTTGG CCAAAAGCTG CTTTCCATGG 3420 

CACACTGCCC TCTGCCACTG ACAGGAAAGT GGATGCCATA GTTTGAATTC ATGCCTCAAG 3480 

TCGGTGGGCC TGCCTACGTG CTGCCCGAGG GCAGGGGCCG TGCAGGGCCA GTCATGGCTG 3540 

TCCCCTGCAA GTGGACGTGG GCTCCAGGGA CTGGAGTGTA ATGCTCGGTG GGAGCCGTCA 3600 

GCCTGTGAAC TGCCAGGCAG CTGCAGTTAG CACAGAGGAT GGCTTCCCCA TTGCCTTCTG 3660 

GGGAGGGACA CAGAGGACGG CTTCCCCATC GCCTTCTGGC CGCTGCAGTC AGCAGAGAGA 3720 

GCGGCTTCCC CATTGCCTTC TGGGGAGGGA CACAGAGGAC AGTTTCCCCA TCGCCTTCTG 3780 

GTTGTTGAAG ACAGCACAGA GAGCGGCTTC CCCATCGCCT TCTGGGGAGG GGCTCCGTGT 3840 

AGCAACCCAG GTGTTGTCCG TGTCTGTTGA CCAATCTCTA TTCAGCATCG TGTGGGTCCC 3900 
TAAGCACAAT AAAAGACATC CACAATGGAA AAAAAAAAAG GAATTC 

Seq ID NO: 509 Protein sequence 
Protein Accession #: NP_001035.1 

1 11 21 31 41 51 

I I I I I I 

MSKSKCSVGL MSSVVAPAKE PNAVGPKEVE LILVKEQNGV QLTSSTLTNP RQSPVEAQDR 60 

ETWGKKIDFL LSVIGFAVDL ANVWRFPYLC YKNGGGAFLV PYLLFMVIAG MPLFYMELAL 120 

GQFNREGAAG VWKICPILKG VGFTVILISL YVGFFYNVII AWALHYLFSS FTTELPWIHC 180 

NNSWNSPNCS DAHPGDSSGD SSGLNDTFGT TPAAEYFERG VLHLHQSHGI DDLGPPRWQL 240 

TACLVLVIVL LYFSLWKGVK TSGKVVWITA TMPYVVLTAL LLRGVTLPGA IDGIRAYLSV 300 

DFYRLCEASV WIDAATQVCF SLGVGFGVLI AFSSYNKFTN NCYRDAIVTT SINSLTSFSS 360 

GFVVFSFLGY MAQKHSVPIG DVAKDGPGLI FIIYPEAIAT LPLSSAWAVV FFIMLLTLGI 420 

DSAMGGMESV ITGLIDEFQL LHRHRELFTL FIVLATFLLS LFCVTNGGIY VFTLLDHFAA 480 

GTSILFGVLI EAIGVAWFYG VGQFSDDIQQ MTGQRPSLYW RLCWKLVSPC FLLFVVVVSI 540 

VTFRPPHYGA YIFPDWANAL GWVIATSSMA MVPIYAAYKF CSLPGSFREK LAYAIAPEKD 600 
RELVDRGEVR QFTLRHWLKV 

Seq ID NO: 510 DNA sequence 

Nucleic Acid Accession #: NM_001216.1 

Coding sequence: 43..1422 

1 11 21 31 41 51 

I I I I I I 

GCCCGTACAC ACCGTGTGCT GGGACACCCC ACAGTCAGCC GCATGGCTCC CCTGTGCCCC 60 

AGCCCCTGGC TCCCTCTGTT GATCCCGGCC CCTGCTCCAG GCCTCACTGT GCAACTGCTG 120 

CTGTCACTGC TGCTTCTGAT GCCTGTCCAT CCCCAGAGGT TGCCCCGGAT GCAGGAGGAT 180 

TCCCCCTTGG GAGGAGGCTC TTCTGGGGAA GATGACCCAC TGGGCGAGGA GGATCTGCCC 240 




AGTGAAGAGG ATTCACCCAG AGAGGAGGAT CCACCCGGAG AGGAGGATCT ACCTGGAGAG 300 

GAGGATCTAC CTGGAGAGGA GGATCTACCT GAAGTTAAGC CTAAATCAGA AGAAGAGGGC 360 

TCCCTGAAGT TAGAGGATCT ACCTACTGTT GAGGCTCCTG GAGATCCTCA AGAACCCCAG 420 

AATAATGCCC ACAGGGACAA AGAAGGGGAT GACCAGAGTC ATTGGCGCTA TGGAGGOGAC 480 

CCGCCCTGGC CCCGGGTGTC CCCAGCCTGC GCGGGCCGCT TCCAGTCCCC GGTGGATATC 540 

CGCCCCCAGC TCGCCGCCTT CTGCCCGGCC CTGCGCCCCC TGGAACTCCT GGGCTTCCAG 600 

CTCCCGCCGC TCCCAGAACT GCGCCTGCGC AACAATGGCC ACAGTGTGCA ACTGACCCTG 660 

CCTCCTGGGC TAGAGATGGC TCTGGGTCCC GGGCGGGAGT ACCGGGCTCT GCAGCTGCAT 720 

CTGCACTGGG GGGCTGCAGG TCGTCCGGGC TCGGAGCACA CTGTGGAAGG CCACCGTTTC 780 

CCTGCCGAGA TCCACGTGGT TCACCTCAGC ACCGCCTTTG CCAGAGTTGA CGAGGCCTTG 840 

GGGCGCCCGG GAGGCCTGGC CGTGTTGGCC GCCTTTCTGG AGGAGGGCCC GGAAGAAAAC 900 

AGTGCCTATG AGCAGTTGCT GTCTCGCTTG GAAGAAATCG CTGAGGAAGG CTCAGAGACT 960 

CAGGTCCCAG GACTGGACAT ATCTGCACTC CTGCCCTCTG ACTTCAGCCG CTACTTCCAA 1020 

TATGAGGGGT CTCTGACTAC ACCGCCCTGT GCCCAGGGTG TCATCTGGAC TGTGTTTAAC 1080 

CAGACAGTGA TGCTGAGTGC TAAGCAGCTC CACACCCTCT CTGACACCCT GTGGGGACCT 1140 

GGTGACTCTC GGCTACAGCT GAACTTCCGA GCGACGCAGC CTTTGAATGG GCGAGTGATT 1200 

GAGGCCTCCT TCCCTGCTGG AGTGGACAGC AGTCCTCGGG CTGCTGAGCC AGTCCAGCTG 1260 

AATTCCTGCC TGGCTGCTGG TGACATCCTA GCCCTGGTTT TTGGCCTCCT TTTTGCTGTC 1320 

ACCAGCGTCG CGTTCCTTGT GCAGATGAGA AGGCAGCACA GAAGGGGAAC CAAAGGGGGT 1380 

GTGAGCTACC GCCCAGCAGA GGTAGCCGAG ACTGGAGCCT AGAGGCTGGA TCTTGGAGAA 1440 

TGTGAGAAGC CAGCCAGAGG CATCTGAGGG GGAGCCGGTA ACTGTCCTGT CCTGCTCATT 1500 
ATGCCACTTC CTTTTAACTG CCAAGAAATT TTTTAAAATA AATATTTATA AT 



Seq ID NO: 511 Protein sequence 
Protein Accession #: NP_001207.1 

1 11 21 31 41 51 

I I I I I I 

MAPLCPSPWL PLLIPAPAPG LTVQLLLSLL LLMPVHPQRL PRMQEDSPLG GGSSGEDDPL 60 
GEEDLPSEED SPREEDPPGE EDLPGEEDLP GEEDLPEVKP KSEEEGSLKL EDLPTVEAPG 120 
DPQEPQNNAH RDKEGDDQSH WRYGGDPPWP RVSPACAGRF QSPVDIRPQL AAFCPALRPL 180 
ELLGFQLPPL PELRLRNNGH SVQLTLPPGL EMALGPGREY RALQLHLHWG AAGRPGSEHT 240 
VEGHRFPAEI HVVHLSTAFA RVDEALGRPG GLAVLAAFLE EGPEENSAYE QLLSRLEEIA 300 
EEGSETQVPG LDISALLPSD FSRYFQYEGS LTTPPCAQGV IWTVFNQTVM LSAKQLHTLS 360 
DTLWGPGDSR LQLNFRATQP LNGRVIEASF PAGVDSSPRA AEPVQLNSCL AAGDILALVF 420 
GLLFAVTSVA FLVQMRRQHR RGTKGGVSYR PAEVAETGA 



Seq ID NO: 512 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 1..3978 

1 11 21 31 41 51 

I I I I I I 

ATGGTGGGTG AAGGACCCTA CCTTATCTCA GATCTGGACC AGCGAGGCCG GCGGAGATCC 60 

TTTGCAGAAA GATATGACCC CAGCCTGAAG ACCATGATCC CAGTGCGACC CTGTGCAAGG 120 

TTAGCACCCA ACCCGGTGGA TGATGCCGGG CTACTCTCCT TCGCCACATT TTCCTGGCTC 180 

ACGCCGGTGA TGGTGAAAGG CTACCGGCAA AGGCTGACCG TAGACACCCT GCCCCCATTG 240 

TCGACATATG ACTCATCTGA CACCAATGCC AAAAGATTTC GAGTCCTTTG GGATGAAGAG 300 

GTAGCAAGGG TGGGTCCTGA GAAGGCCTCT CTGAGCCACG TGGTGTGGAA ATTCCAGAGG 360 

ACACGCGTGT TGATGGACAT CGTGGCCAAC ATCCTGTGCA TCATCATGGC AGCCATAGGG 420 

CCGACAGTTC TCATTCACCA AATCCTCCAG CAGACTGAGA GGACCTCTGG GAAAGTCTGG 480 

GTTGGCATTG GACTGTGCAT AGCCCTTTTT GCCACCGAGT TTACCAAAGT CTTCTTTTGG 540 

GCCCTTGCCT GGGCCATCAA CTACCGCACG GCCATCCGGT TGAAGGTGGC GCTCTCCACC 600 

TTGGTTTTTG AAAACCTAGT GTCCTTCAAG ACATTGACCC ACATCTCTGT TGGCGAGGTG 660 

CTCAATATAC TGTCAAGTGA TAGCTATTCT TTGTTTGAAG CTGCCTTGTT TTGTCCTTTG 720 

CCAGCCACCA TCCCGATCCT AATGGTCTTT TGTGCGGCGT ACGCCTTTTT CATTCTGGGG 780 

CCCACAGCTC TCATCGGGAT ATCAGTGTAT GTCATATTCA TACCCGTCCA GATGTTTATG 840 

GCCAAGCTCA ATTCAGCTTT CCGAAGGTCA GCAATTTTGG TGACAGACAA GCGAGTTCAG 900 

ACAATGAATG AGTTTCTGAC CTGCATCAGG CTGATCAAAA TGTATGCCTG GGAGAAATCT 960 

TTTACCAACA CTATCCAAGA TATAAGAAGG AGGGAAAGAA AATTACTGGA AAAAGCTGGA 1020 

TTTGTCCAAA GTGGAAACTC TGCCCTGGCC CCCATCGTGT CCACCATAGC CATCGTGCTG 1080 

ACATTATCCT GCCACATCCT CCTGAGACGC AAACTCACCG CACCCGTGGC ATTTAGTGTG 1140 

ATTGCCATGT TTAATGTAAT GAAGTTTTCC ATTGCAATCT TGCCCTTCTC CATCAAAGCA 1200 

ATGGCTGAAG CGAATGTCTC TCTAAGGAGA ATGAAGAAAA TTCTCATAGA TAAAAGCCCC 1260 

CCATCTTACA TCACCCAACC AGAAGACCCA GATACTGTCT TGCTTTTAGC AAATGCCACC 1320 

TTGACATGGG AGCATGAAGC CAGCAGGAAA AGTACCCCAA AGAAATTGCA GAACCAGAAA 1380 

AGGCATTTAT GCAAGAAACA GAGGTCAGAG GCATACAGTG AGAGGAGTCC ACCAGCCAAG 1440 

GGAGCCACTG GCCCAGAGGA GCAAAGTGAC AGCCTCAAAT CGGTTCTGCA CAGCATAAGC 1500 

TTTGTGGTGA GAAAGTTATG TCGTTATCCC GAAGCCCAGC TCCTGGCTTG GAGGTGGCCA 1560 

GCAGTGTTTG TTGGGAGAAT CATCAGAGGA TACAGGCCTC ATGGATTTTC TGCTAAAGAC 1620 

AAGGATGAAT CTAGAAGGCT TCTTACTTGG CCCCAAGAAG TGGATAGGAC TCAAAGGGCA 1680 

GCCAAATACC TGGGGAAGAT CTTGGGAATA TGTGGGAATG TGGGAAGTGG AAAGAGCTCC 1740 

CTCCTTGCAG CTCTCCTAGG ACAGATGCAG CTGCAGAAAG GGGTGGTGGC AGTCAATGGA 1800 

ACTTTGGCCT ACGTTTCACA GCAGGCATGG ATCTTTCATG GAAATGTGAG AGAAAACATA 1860 

CTCTTTGGAG AAAAGTATGA TCACCAAAGG TATCAGCACA CAGTCCGCGT CTGTGGCCTC 1920 

CAGAAGGACC TGAGCAACCT CCCCTATGGA GACCTGACTG AGATTGGGGA GCGGGGCCTC 1980 

AACCTCTCTG GGGGGCAGAG GCAGAGGATT AGCCTGGCCC GCGCTGTCTA CTCOGACCGT 2040 

CAGCTCTACC TGCTGGACGA CCCCCTGTCG GCCGTGGACG CCCACGTGGG GAAGCACGTC 2100 

TTTGAGGAGT GCATTAAGAA GACGCTCAGG GGAAAGACAG TCGTCCTGGT GACCCACCAG 2160 

CTACAGTTCT TAGAGTCTTG TGATGAAGTT ATTTTATTAG AAGATGGAGA GATTTGTGAA 2220 

AAGGGAACCC ACAAGGAGTT AATGGAGGAG AGAGGGCGCT ATGCAAAACT GATTCACAAC 2280 

CTGCGAGGAT TGCAGTTCAA GGATCCTGAA CACCTTTACA ATGCAGCAAT GGTGGAAGCC 2340 

TTCAAGGAGA GCCCTGCTGA GAGAGAGGAA GATGCTGGTA TAATCGGGTA CCTCCTTTCT 2400 

CTCTTCACTG TGTTCCTCTT CCTCCTGATG ATTGGCAGCG CTGCCTTCAG CAACTGGTGG 2460 

CTGGGTCTCT GGTTGGACAA GGGCTCACGG ATGACCTGTG GGCCCCAGGG CAACAGGACC 2520 

ATGTGTGAGG TCGGCGCGGT GCTGGCAGAC ATCGGTCAGC ATGTGTACCA GTGGGTGTAC 2580 

ACTGCAAGCA TGGTGTTCAT GCTGGTGTTT GGOGTCACCA AAGGCTTCGT CTTCACCAAG 2640 

ACCACACTGA TGGCATCCTC CTCTCTGCAT GACACGGTGT TTGATAAGAT CTTAAAGAGC 2700 

CCAATGAGTT TCTTTGACAC GACTCCCACT GGCAGGCTAA TGAACCGTTT TTCCAAGGAT 2760 

ATGGACGAGC TGGATGTGAG GCTGCCGTTT CACGCAGAGA ACTTTCTGCA GCAGTTTTTT 2820 

ATGGTGGTGT TTATTCTCGT GATCTTGGCT GCTGTGTTTC CTGCTGTCCT TTTAGTCGTG 2880 

GCCAGCCTTG CTGTAGGCTT CTTCATTCTG TTACGCATTT TCCACAGAGG AGTCCAGGAG 2940 

CTCAAGAAGG TGGAGAATGT CAGCCGGTCA CCCTGGTTCA CCCACATCAC CTCCTCCATG 3000 

CAGGGCCTGG GCATCATTCA CGCCTATGGC AAGAAGGAGA GCTGCATCAC CTATACTTCA 3060 

TCCAAAGGCC TGTCATTGTC ATACATCATC CAGCTGAGCG GACTGCTCCA AGTGTGTGTG 3120 

CGAACGGGAA CAGAGACGCA AGCCAAATTC ACCTCCGTGG AGCTGCTCAG GGAATACATT 3180 

TCGACCTGTG TTCCTGAATG CACTCATCCC CTCAAAGTGG GGACCTGTCC CAAGGACTGG 3240 

CCCAGCTGTG GGGAGATCAC CTTCAGAGAC TATCAGATGA GATACAGAGA CAACACCCCC 3300 

CTTGTTCTCG ACAGCCTGAA CTTGAACATA CAAAGTGGGC AGACAGTCGG GATTGTTGGA 3360 

AGAACAGGTT CCGGAAAGTC ATCGTTAGGA ATGGCTTTGT TTCGTCTGGT GGAGCCAGCC 3420 

AGTGGCACAA TCTTTATTGA TGAGGTGGAT ATCTGCATTC TCAGCTTGGA AGACCTCAGA 3480 

ACCAAGCTGA CTGTGATCCC ACAGGATCCT GTCCTGTTTG TAGGTACAGT AAGGTACAAC 3540 

TTGGATCCCT TTGAGAGTCA CACCGATGAG ATGCTCTGGC AGGTTCTGGA GAGAACATTC 3600 

ATGAGAGACA CAATAATGAA ACTCCCAGAA AAATTACAGG CAGAAGTCAC AGAAAATGGA 3660 

GAAAACTTCT CAGTAGGGGA ACGTCAGCTG CTTTGTGTGG CCCGAGCTCT TCTCCGTAAT 3720 

TCAAAGATCA TTCTCCTTGA TGAAGCCACC GCCTCTATGG ACTCCAAGAC TGACACCCTG 3780 

GTTCAGAACA CCATCAAAGA TGCCTTCAAG GGCTGCACTG TGCTGACCAT CGCCCACCGC 3840 

CTCAACACAG TTCTCAACTG CGATCACGTC CTGGTTATGG AAAATGGGAA GGTGATTGAG 3900 

TTTGACAAGC CTGAAGTCCT TGCAGAGAAG CCAGATTCTG CATTTGCGAT GTTACTAGCA 3960 
GCAGAAGTCA GATTGTAG 

Seq ID NO: 513 Protein sequence 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

I I I I I I 

MVGEGPYLIS DLDQRGRRRS FAERYDPSLK TMIPVRPCAR LAPNPVDDAG LLSFATFSWL 60 

TPVMVKGYRQ RLTVDTLPPL STYDSSDTNA KRFRVLWDEE VARVGPEKAS LSHVVWKFQR 120 

TRVLMDIVAN ILCIIMAAIG PTVLIHQILQ QTERTSGKVW VGIGLCIALF ATEFTKVFFW 180 

ALAWAINYRT AIRLKVALST LVFENLVSFK TLTHISVGEV LNILSSDSYS LFEAALFCPL 240 

PATIPILMVF CAAYAFFILG PTALIGISVY VIFIPVQMFM AKLNSAFRRS AILVTDKRVQ 300 

TMNEFLTCIR LIKMYAWEKS FTNTIQDIRR RERKLLEKAG FVQSGNSALA PIVSTIAIVL 360 

TLSCHILLRR KLTAPVAFSV IAMFNVMKFS IAILPFSIKA MAEANVSLRR MKKILIDKSP 420 

PSYITQPEDP DTVLLLANAT LTWEHEASRK STPKKLQNQK RHLCKKQRSE AYSERSPPAK 480 

GATGPEEQSD SLKSVLHSIS FWRKLCRYP EAQLLAWRWP AVFVGRIIRG YRPHGFSAKD 540 

KDESRRLLTW PQEVDRTQRA AKYLGKILGI CGNVGSGKSS LLAALLGQMQ LQKGWAVNG 600 

TLAYVSQQAW IFHGNVRENI LFGEKYDHQR YQHTVRVCGL QKDLSNLPYG DLTEIGERGL 660 

NLSGGQRQRI SLARAVYSDR QLYLLDDPLS AVDAHVGKHV FEECIKKTLR GKTWLVTHQ 720 

LQFLESCDEV ILLEDGEICE KGTHKELMEE RGRYAKLIHN LRGLQFKDPE HLYNAAMVEA 780 

FKESPAEREE DAGIIGYLLS LFTVFLFLLM IGSAAFSNWW LGLWLDKGSR MTCGPQGNRT 840 

MCEVGAVLAD IGQHVYQWVY TASMVFMLVF GVTKGFVFTK TTLMASSSLH DTVFDKILKS 900 

PMSFFDTTPT GRLMNRFSKD MDELDVRLPF HAENFLQQFF MVVFILVILA AVFPAVLLVV 960 

ASLAVGFFIL LRIFHRGVQE LKKVENVSRS PWFTHITSSM QGLGIIHAYG KKESCITYTS 1020 

SKGLSLSYII QLSGLLQVCV RTGTETQAKF TSVELLREYI STCVPECTHP LKVGTCPKDW 1080 

PSCGEITFRD YQMRYRDNTP LVLDSLNLNI QSGQTVGIVG RTGSGKSSLG MALFRLVEPA 1140 

SGTIFIDEVD ICILSLEDLR TKLTVIPQDP VLFVGTVRYN LDPFESHTDE MLWQVLERTF 1200 

MRDTIMKLPE KLQAEVTENG ENFSVGERQL LCVARALLRN SKIILLDEAT ASMDSKTDTL 1260 

VQNTIKDAFK GCTVLTIAHR LNTVLNCDHV LVMENGKVIE FDKPEVLAEK PDSAFAMLLA 1320 
AEVRL 

Seq ID NO: 514 DNA sequence 
Nucleic Acid Accession #: Z31560 
Coding sequence: 1-966 

1 11 21 31 41 51 

I I I I I I 

CACAGCGCCC GCATGTACAA CATGATGGAG ACGGAGCTGA AGCCGCCGGG CCCGCAGCAA 60 

ACTTCGGGGG GCGGCGGCGG CAACTCCACC GCGGCGGCGG CCGGCGGCAA CCAGAAAAAC 120 

AGCCCGGACC GCGTCAAGCG GCCCATGAAT GCCTTCATGG TGTGGTCCCG CGGGCAGCGG 180 

CGCAAGATGG CCCAGGAGAA CCCCAAGATG CACAACTCGG AGATCAGCAA GCGCCTGGGC 240 

GCCGAGTGGA AACTTTTGTC GGAGACGGAG AAGCGGCCGT TCATCGACGA GGCTAAGCGG 300 

CTGCGAGCGC TGCACATGAA GGAGCACCCG GATTATAAAT ACCGGCCCCG GCGGAAAACC 360 

AAGACGCTCA TGAAGAAGGA TAAGTACACG CTGCCCGGCG GGCTGCTGGC CCCCGGCGGC 420 

AATAGCATGG CGAGCGGGGT CGGGGTGGGC GCCGGCCTGG GCGCGGGCGT GAACCAGCGC 480 

ATGGACAGTT ACGCGCACAT GAACGGCTGG AGCAACGGCA GCTACAGCAT GATGCAGGAC 540 

CAGCTGGGCT ACCCGCAGCA CCCGGGCCTC AATGCGCACG GCGCAGCGCA GATGCAGCCC 600 

ATGCACCGCT ACGACGTGAG CGCCCTGCAG TACAACTCCA TGACCAGCTC GCAGACCTAC 660 

ATGAACGGCT CGCCCACCTA CAGCATGTCC TACTCGCAGC AGGGCACCCC TGGCATGGCT 720 

CTTGGCTCCA TGGGTTCGGT GGTCAAGTCC GAGGCCAGCT CCAGCCCCCC TGTGGTTACC 780 

TCTTCCTCCC ACTCCAGGGC GCCCTGCCAG GCCGGGGACC TCCGGGACAT GATCAGCATG 840 

TATCTCCCCG GCGCCGAGGT GCCGGAACCC GCCGCCCCCA GCAGACTTCA CATGTCCCAG 900 

CACTACCAGA GCGGCCCGGT GCCCGGCACG GCCATTAACG GCACACTGCC CCTCTCACAC 960 

ATGTGAGGGC CGGACAGCGA ACTGGAGGGG GGAGAAATTT TCAAAGAAAA ACGAGGGAAA 1020 

TGGGAGGGGT GCAAAAGAGG AGAGTAAGAA ACAGCATGGA GAAAACCCGG TACGCTCAAA 1080 
AAAAA 

Seq ID NO: 515 Protein sequence 
Protein Accession #: CAA83435 

1 11 21 31 41 51 

I I I I I I 

HSARMYNMME TELKPPGPQQ TSGGGGGNST AAAAGGNQKN SPDRVKRPMN AFMVWSRGQR 60 

RKMAQENPKM HNSEISKRLG AEWKLLSETE KRPFIDEAKR LRALHMKEHP DYKYRPRRKT 120 

KTLMKKDKYT LPGGLLAPGG NSMASGVGVG AGLGAGVNQR MDSYAHMNGW SNGSYSMMQD 180 




QLGYPQHPGL NAHGAAQMQP MHRYDVSALQ YNSMTSSQTY MNGSPTYSMS YSQQGTPGMA 240 

LGSMGSVVKS EASSSPPVVT SSSHSRAPCQ AGDLRDMISM YLPGAEVPEP AAPSRLHMSQ 300 
HYQSGPVPGT AINGTLPLSH M 

Seq ID NO: 516 DNA sequence 
Nucleic Acid Accession #: U91618 
Coding sequence: 29.. 541 

1 11 21 31 41 51 

I I I I I I 

CGGACTTGGC TTGTTAGAAG GCTGAAAGAT GATGGGAGGA ATGAAAATCC AGCTTGTATG 60 

CATGCTACTC CTGGCTTTCA GCTCCTGGAG TCTGTGCTCA GATTCAGAAG AGGAAATGAA 120 

AGCATTAGAA GCAGATTTCT TGACCAATAT GCATACATCA AAGATTAGTA AAGCACATGT 180 

TCCCTCTTGG AAGATGACTC TGCTAAATGT TTGCAGTCTT GTAAATAATT TGAACAGCCC 240 

AGCTGAGGAA ACAGGAGAAG TTCATGAAGA GGAGCTTGTT GCAAGAAGGA AACTTCCTAC 300 

TGCTTTAGAT GGCTTTAGCT TGGAAGCAAT GTTGACAATA TACCAGCTCC ACAAAATCTG 360 

TCACAGCAGG GCTTTTCAAC ACTGGGAGTT AATCCAGGAA GATATTCTTG ATACTGGAAA 420 

TGACAAAAAT GGAAAGGAAG AAGTCATAAA GAGAAAAATT CCTTATATTC TGAAACGGCA 480 

GCTGTATGAG AATAAACCCA GAAGACCCTA CATACTCAAA AGAGATTCTT ACTATTACTG 540 

AGAGAATAAA TCATTTATTT ACATGTGATT GTGATTCATC ATCCCTTAAT TAAATATCAA 600 

ATTATATTTG TGTGAAAATG TGACAAACAC ACTTATCTGT CTCTTCTACA ATTGTGGTTT 660 

ATTGAATGTG TTTTTCTGCA CTAATAGAAA TTAGACTAAG TGTTTTCAAA TAAATCTAAA 720 
TCTTCAAAAA AAAAAAAAAA AAATGGGGCC GCAATT 

Seq ID NO: 517 Protein sequence 
Protein Accession #: AAB50564 

1 11 21 31 41 51 

I I I I I I 

MMAGMKIQLV CMLLLAFSSW SLCSDSEEEM KALEADFLTN MHTSKISKAH VPSWKMTLLN 60 

VCSLVNNLNS PAEETGEVHE EELVARRKLP TALDGFSLEA MLTIYQLHKI CHSRAFQHWE 120 
LIQEDILDTG NDKNGKEEVI KRKIPYILKR QLYENKPRRP YILKRDSYYY 

Seq ID NO: 518 DNA sequence 

Nucleic Acid Accession #: NM_006536.2 

Coding sequence: 109.. 2940 

1 11 21 31 41 51 

I I I I I I 

ACCTAAAACC TTGCAAGTTC AGGAAGAAAC CATCTGCATC CATATTGAAA ACCTGACACA 60 

ATGTATGCAG CAGGCTCAGT GTGAGTGAAC TGGAGGCTTC TCTACAACAT GACCCAAAGG 120 

AGCATTGCAG GTCCTATTTG CAACCTGAAG TTTGTGACTC TCCTGGTTGC CTTAAGTTCA 180 

GAACTCCCAT TCCTGGGAGC TGGAGTACAG CTTCAAGACA ATGGGTATAA TGGATTGCTC 240 

ATTGCAATTA ATCCTCAGGT ACCTGAGAAT CAGAACCTCA TCTCAAACAT TAAGGAAATG 300 

ATAACTGAAG CTTCATTTTA CCTATTTAAT GCTACCAAGA GAAGAGTATT TTTCAGAAAT 360 

ATAAAGATTT TAATACCTGC CACATGGAAA GCTAATAATA ACAGCAAAAT AAAACAAGAA 420 

TCATATGAAA AGGCAAATGT CATAGTGACT GACTGGTATG GGGCACATGG AGATGATCCA 480 

TACACCCTAC AATACAGAGG GTGTGGAAAA GAGGGAAAAT ACATTCATTT CACACCTAAT 540 

TTCCTACTGA ATGATAACTT AACAGCTGGC TACGGATCAC GAGGCCGAGT GTTTGTCCAT 600 

GAATGGGCCC ACCTCCGTTG GGGTGTGTTC GATGAGTATA ACAATGACAA ACCTTTCTAC 660 

ATAAATGGGC AAAATCAAAT TAAAGTGACA AGGTGTTCAT CTGACATCAC AGGCATTTTT 720 

GTGTGTGAAA AAGGTCCTTG CCCCCAAGAA AACTGTATTA TTAGTAAGCT TTTTAAAGAA 780 

GGATGCACCT TTATCTACAA TAGCACCCAA AATGCAACTG CATCAATAAT GTTCATGCAA 840 

AGTTTATCTT CTGTGGTTGA ATTTTGTAAT GCAAGTACCC ACAACCAAGA AGCACCAAAC 900 

CTACAGAACC AGATGTGCAG CCTCAGAAGT GCATGGGATG TAATCACAGA CTCTGCTGAC 960 

TTTCACCACA GCTTTCCCAT GAATGGGACT GAGCTTCCAC CTCCTCCCAC ATTCTCGCTT 1020 

GTACAGGCTG GTGACAAAGT GGTCTGTTTA GTGCTGGATG TGTCCAGCAA GATGGCAGAG 1080 

GCTGACAGAC TCCTTCAACT ACAACAAGCC GCAGAATTTT ATTTGATGCA GATTGTTGAA 1140 

ATTCATACCT TCGTGGGCAT TGCCAGTTTC GACAGCAAAG GAGAGATCAG AGCCCAGCTA 1200 

CACCAAATTA ACAGCAATGA TGATCGAAAG TTGCTGGTTT CATATCTGCC CACCACTGTA 1260 

TCAGCTAAAA CAGACATCAG CATTTGTTCA GGGCTTAAGA AAGGATTTGA GGTGGTTGAA 1320 

AAACTGAATG GAAAAGCTTA TGGCTCTGTG ATGATATTAG TGACCAGCGG AGATGATAAG 1380 

CTTCTTGGCA ATTGCTTACC CACTGTGCTC AGCAGTGGTT CAACAATTCA CTCCATTGCC 1440 

CTGGGTTCAT CTGCAGCCCC AAATCTGGAG GAATTATCAC GTCTTACAGG AGGTTTAAAG 1500 

TTCTTTGTTC CAGATATATC AAACTCCAAT AGCATGATTG ATGCTTTCAG TAGAATTTCC 1560 

TCTGGAACTG GAGACATTTT CCAGCAACAT ATTCAGCTTG AAAGTACAGG TGAAAATGTC 1620 

AAACCTCACC ATCAATTGAA AAACACAGTG ACTGTGGATA ATACTGTGGG CAACGACACT 1680 

ATGTTTCTAG TTACGTGGCA GGCCAGTGGT CCTCCTGAGA TTATATTATT TGATCCTGAT 1740 

GGACGAAAAT ACTACACAAA TAATTTTATC ACCAATCTAA CTTTTCGGAC AGCTAGTCTT 1800 

TGGATTCCAG GAACAGCTAA GCCTGGGCAC TGGACTTACA CCCTGAACAA TACCCATCAT 1860 

TCTCTGCAAG CCCTGAAAGT GACAGTGACC TCTCGCGCCT CCAACTCAGC TGTGCCCCCA 1920 

GCCACTGTGG AAGCCTTTGT GGAAAGAGAC AGCCTCCATT TTCCTCATCC TGTGATGATT 1980 

TATGCCAATG TGAAACAGGG ATTTTATCCC ATTCTTAATG CCACTGTCAC TGCCACAGTT 2040 

GAGCCAGAGA CTGGAGATCC TGTTACGCTG AGACTCCTTG ATGATGGAGC AGGTGCTGAT 2100 

GTTATAAAAA ATGATGGAAT TTACTCGAGG TATTTTTTCT CCTTTGCTGC AAATGGTAGA 2160 

TATAGCTTGA AAGTGCATGT CAATCACTCT CCCAGCATAA GCACCCCAGC CCACTCTATT 2220 

CCAGGGAGTC ATGCTATGTA TGTACCAGGT TACACAGCAA ACGGTAATAT TCAGATGAAT 2280 

GCTCCAAGGA AATCAGTAGG CAGAAATGAG GAGGAGCGAA AGTGGGGCTT TAGCCGAGTC 2340 

AGCTCAGGAG GCTCCTTTTC AGTGCTGGGA GTTCCAGCTG GCCCCCACCC TGATGTGTTT 2400 

CCACCATGCA AAATTATTGA CCTGGAAGCT GTAAAAGTAG AAGAGGAATT GACCCTATCT 2460 

TGGACAGCAC CTGGAGAAGA CTTTGATCAG GGCCAGGCTA CAAGCTATGA AATAAGAATG 2520 

AGTAAAAGTC TACAGAATAT CCAAGATGAC TTTAACAATG CTATTTTAGT AAATACATCA 2580 

AAGCGAAATC CTCAGCAAGC TGGCATCAGG GAGATATTTA CGTTCTCACC CCAGATTTCC 2640 

ACGAATGGAC CTGAACATCA GCCAAATGGA GAAACACATG AAAGCCACAG AATTTATGTT 2700 

GCAATACGAG CAATGGATAG GAACTCCTTA CAGTCTGCTG TATCTAACAT TGCCCAGGCG 2760 

CCTCTGTTTA TTCCCCCCAA TTCTGATCCT GTACCTGCCA GAGATTATCT TATATTGAAA 2820 

GGAGTTTTAA CAGCAATGGG TTTGATAGGA ATCATTTGCC TTATTATAGT TGTGACACAT 2880 

CATACTTTAA GCAGGAAAAA GAGAGCAGAC AAGAAAGAGA ATGGAACAAA ATTATTATAA 2940 

ATAAATATCC AAAGTGTCTT CCTTCTTAGA TATAAGACCC ATGGCCTTOG ACTACAAAAA 3000 

CATACTAACA AAGTCAAATT AACATCAAAA CTGTATTAAA ATGCATTGAG TTTTTGTACA 3060 

ATACAGATAA GATTTTTACA TGGTAGATCA ACAATTCTTT TTGGGGGTAG ATTAGAAAAC 3120 

CCTTACACTT TGGCTATGAA CAAATAATAA AAATTATTCT TTAAAGTAAT GTCTTTAAAG 3180 

GCAAAGGGAA GGGTAAAGTC GGACCAGTGT CAAGGAAAGT TTGTTTTATT GAGGTGGAAA 3240 

AATAGCCCCA AGCAGAGAAA AGGAGGGTAG GTCTGCATTA TAACTGTCTG TGTGAAGCAA 3300 

TCATTTAGTT ACTTTGATTA ATTTTTCTTT TCTCCTTATC TGTGCAGTAC AGGTTGCTTG 3360 

TTTACATGAA GATCATGCTA TATTTTATAT ATGTAGCCCC TAATGCAAAG CTCTTTACCT 3420 

CTTGCTATTT TGTTATATAT ATTTCAGATG ACATCTCCCT GCTAATGCTC AGAGATCTTT 3480 

TTTCACTGTA AQAGGTAACC TTTAACAATA TGGGTATTAC CTTTGTCTCT TCATACCGGT 3540 

TTTATGACAA AGGTCTATTG AATTTATTTG TNTGTAAGTT TCTACTCCCA TCAAAGCAGC 3600 

TTTCTAAGTT TATTGCCTTG GGTTATTATG GAATGATAGT TATAGCCCCN TATAATGCCT 3660 
TACCTAGGAA A 

Seq ID NO: 519 Protein sequence 
Protein Accession #: NP_006527.1 

1 11 21 31 41 51 

I I I I I I 

MTQRSIAGPI CNLKFVTLLV ALSSELPFLG AGVQLQDNGY NGLLIAINPQ VPENQNLISN 60 

IKEMITEASF YLFNATKRRV FFRNIKILIP ATWKANNNSK IKQESYEKAN VIVTDWYGAH 120 

GDDPYTLQYR GCGKEGKYIH FTPNFLLNDN LTAGYGSRGR VFVHEWAHLR WGVFDEYNND 180 

KPFYINGQNQ IKVTRCSSDI TGIFVCEKGP CPQENCIISK LFKEGCTFIY NSTQNATASI 240 

MFMQSLSSVV EFCNASTHNQ EAPNLQNQMC SLRSAWDVIT DSADFHHSFP MNGTELPPPP 300 

TFSLVQAGDK VVCLVLDVSS KMAEADRLLQ LQQAAEFYLM QIVEIHTFVG IASFDSKGEI 360 

RAQLHQINSN DDRKLLVSYL PTTVSAKTDI SICSGLKKGF EVVEKLNGKA YGSVMILVTS 420 

GDDKLLGNCL PTVLSSGSTI HSIALGSSAA PNLEELSRLT GGLKFFVPDI SNSNSMIDAF 480 

SRISSGTGDI FQQHIQLEST GENVKPHHQL KNTVTVDNTV GNDTMFLVTW QASGPPEIIL 540 

FDPDGRKYYT NNFITNLTFR TASLWIPGTA KPGHWTYTLN NTHHSLQALK VTVTSRASNS 600 

AVPPATVEAF VERDSLHFPH PVMIYANVKQ GFYPILNATV TATVEPETGD PVTLRLLDDG 660 

AGADVIKNDG IYSRYFFSFA ANGRYSLKVH VNHSPSISTP AHSIPGSHAM YVPGYTANGN 720 

IQMNAPRKSV GRNEEERKWG FSRVSSGGSF SVLGVPAGPH PDVFPPCKII DLEAVKVEEE 780 

LTLSWTAPGE DFDQGQATSY EIRMSKSLQN IQDDFNNAIL VNTSKRNPQQ AGIREIFTFS 840 

PQISTNGPEH QPNGETHESH RIYVAIRAMD RNSLQSAVSN IAQAPLFIPP NSDPVPARDY 900 
LILKGVLTAM GLIGIICLII VVTHHTLSRK KRADKKENGT KLL 

Seq ID NO: 520 DNA sequence 

Nucleic Acid Accession #: NM_000228.1 

Coding sequence: 82.. 3600 

1 11 21 31 41 51 

I I I I I I 

GCTTTCAGGC GATCTGGAGA AAGAACGGCA GAACACACAG CAAGGAAAGG TCCTTTCTGG 60 

GGATCACCCC ATTGGCTGAA GATGAGACCA TTCTTCCTCT TGTGTTTTGC CCTGCCTGGC 120 

CTCCTGCATG CCCAACAAGC CTGCTCCCGT GGGGCCTGCT ATCCACCTGT TGGGGACCTG 180 

CTTGTTGGGA GGACCCGGTT TCTCCGAGCT TCATCTACCT GTGGACTGAC CAAGCCTGAG 240 

ACCTACTGCA CCCAGTATGG CGAGTGGCAG ATGAAATGCT GCAAGTGTGA CTCCAGGCAG 300 

CCTCACAACT ACTACAGTCA CCGAGTAGAG AATGTGGCTT CATCCTCCGG CCCCATGCGC 360 

TGGTGGCAGT CCCAGAATGA TGTGAACCCT GTCTCTCTGC AGCTGGACCT GGACAGGAGA 420 

TTCCAGCTTC AAGAAGTCAT GATGGAGTTC CAGGGGCCCA TGCCCGCCGG CATGCTGATT 480 

GAGCGCTCCT CAGACTTCGG TAAGACCTGG CGAGTGTACC AGTACCTGGC TGCCGACTGC 540 

ACCTCCACCT TCCCTCGGGT CCGCCAGGGT CGGCCTCAGA GCTGGCAGGA TGTTCGGTGC 600 

CAGTCCCTGC CTCAGAGGCC TAATGCACGC CTAAATGGGG GGAAGGTCCA ACTTAACCTT 660 

ATGGATTTAG TGTCTGGGAT TCCAGCAACT CAAAGTCAAA AAATTCAAGA GGTGGGGGAG 720 

ATCACAAACT TGAGAGTCAA TTTCACCAGG CTGGCCCCTG TGCCCCAAAG GGGCTACCAC 780 

CCTCCCAGCG CCTACTATGC TGTGTCCCAG CTCCGTCTGC AGGGGAGCTG CTTCTGTCAC 840 

GGCCATGCTG ATCGCTGCGC ACCCAAGCCT GGGGCCTCTG CAGGCCCCTC CACCGCTGTG 900 

CAGGTCCACG ATGTCTGTGT CTGCCAGCAC AACACTGCCG GCCCAAATTG TGAGCGCTGT 960 

GCACCCTTCT ACAACAACCG GCCCTGGAGA CCGGCGGAGG GCCAGGACGC CCATGAATGC 1020 

CAAAGGTGCG ACTGCAATGG GCACTCAGAG ACATGTCACT TTGACCCCGC TGTGTTTGCC 1080 

GCCAGCCAGG GGGCATATGG AGGTGTGTGT GACAATTGCC GGGACCACAC CGAAGGCAAG 1140 

AACTGTGAGC GGTGTCAGCT GCACTATTTC CGGAACCGGC GCCCGGGAGC TTCCATTCAG 1200 

GAGACCTGCA TCTCCTGCGA GTGTGATCCG GATGGGGCAG TGCCAGGGGC TCCCTGTGAC 1260 

CCAGTGACCG GGCAGTGTGT GTGCAAGGAG CATGTGCAGG GAGAGCGCTG TGACCTATGC 1320 

AAGCCGGGCT TCACTGGACT CACCTACGCC AACCCGCAGG GCTGCCACCG CTGTGACTGC 1380 

AACATCCTGG GGTCCCGGAG GGACATGCCG TGTGACGAGG AGAGTGGGCG CTGCCTTTGT 1440 

CTGCCCAACG TGGTGGGTCC CAAATGTGAC CAGTGTGCTC CCTACCACTG GAAGCTGGCC 1500 

AGTGGCCAGG GCTGTGAACC GTGTGCCTGC GACCCGCACA ACTCCCCTCA GCCCACAGTG 1560 

CAACCAGTTC ACAGGGCAGT GCCCTGTCGG GAAGGCTTTG GTGGCCTGAT GTGCAGCGCT 1620 

GCAGCCATCC GCCAGTGTCC AGACCGGACC TATGGAGACG TGGCCACAGG ATGCCGAGCC 1680 

TGTGACTGTG ATTTCCGGGG AACAGAGGGC CCGGGCTGCG ACAAGGCATC AGGCCGCTGC 1740 

CTCTGCCGCC CTGGCTTGAC CGGGCCCCGC TGTGACCAGT GCCAGCGAGG CTACTGCAAT 1800 

CGCTACCCGG TGTGCGTGGC CTGCCACCCT TGCTTCCAGA CCTATGATGC GGACCTCCGG 1860 

GAGCAGGCCC TGCGCTTTGG TAGACTCCGC AATGCCACCG CCAGCCTGTG GTCAGGGCCT 1920 

GGGCTGGAGG ACCGTGGCCT GGCCTCCCGG ATCCTAGATG CAAAGAGTAA GATTGAGCAG 1980 

ATCCGAGCAG TTCTCAGCAG CCCCGCAGTC ACAGAGCAGG AGGTGGCTCA GGTGGCCAGT 2040 

GCCATCCTCT CCCTCAGGCG AACTCTCCAG GGCCTGCAGC TGGATCTGCC CCTGGAGGAG 2100 

GAGACGTTGT CCCTTCCGAG AGACCTGGAG AGTCTTGACA GAAGCTTCAA TGGTCTCCTT 2160 

ACTATGTATC AGAGGAAGAG GGAGCAGTTT GAAAAAATAA GCAGTGCTGA TCCTTCAGGA 2220 

GCCTTCCGGA TGCTGAGCAC AGCCTACGAG CAGTCAGCCC AGGCTGCTCA GCAGGTCTCC 2280 

GACAGCTCGC GCCTTTTGGA CCAGCTCAGG GACAGCCGGA GAGAGGCAGA GAGGCTGGTG 2340 

CGGCAGGCGG GAGGAGGAGG AGGCACCGGC AGCCCCAAGC TTGTGGCCCT GAGGCTGGAG 2400 

ATGTCTTCGT TGCCTGACCT GACACCCACC TTCAACAAGC TCTGTGGCAA CTCCAGGCAG 2460 

ATGGCTTGCA CCCCAATATC ATGCCCTGGT GAGCTATGTC CCCAAGACAA TGGCACAGCC 2520 

TGTGGCTCCC GCTGCAGGGG TGTCCTTCCC AGGGCCGGTG GGGCCTTCTT GATGGCGGGG 2580 

CAGGTGGCTG AGCAGCTGCG GGGCTTCAAT GCCCAGCTCC AGCGGACCAG GCAGATGATT 2640 

AGGGCAGCCG AGGAATCTGC CTCACAGATT CAATCCAGTG CCCAGCGCTT GGAGACCCAG 2700 

GTGAGCGCCA GCCGCTCCCA GATGGAGGAA GATGTCAGAC GCACACGGCT CCTAATCCAG 2760 

CAGGTCCGGG ACTTCCTAAC AGACCCCGAC ACTGATGCAG CCACTATCCA GGAGGTCAGC 2820 

GAGGCCGTGC TGGCCCTGTG GCTGCCCACA GACTCAGCTA CTGTTCTGCA GAAGATGAAT 2880 

GAGATCCAGG CCATTGCAGC CAGGCTCCCC AACGTGGACT TGGTGCTGTC CCAGACCAAG 2940 

CAGGACATTG CGCGTGCCCG CCGGTTGCAG GCTGAGGCTG AGGAAGCCAG GAGCCGAGCC 3000 

CATGCAGTGG AGGGCCAGGT GGAAGATGTG GTTGGGAACC TGCGGCAGGG GACAGTGGCA 3060 

CTGCAGGAAG CTCAGGACAC CATGCAAGGC ACCAGCCGCT CCCTTCGGCT TATCCAGGAC 3120 

AGGGTTGCTG AGGTTCAGCA GGTACTGCGG CCAGCAGAAA AGCTGGTGAC AAGCATGACC 3180 

AAGCAGCTGG GTGACTTCTG GACACGGATG GAGGAGCTCC GCCACCAAGC CCGGCAGCAG 3240 

GGGGCAGAGG CAGTCCAGGC CCAGCAGCTT GCGGAAGGTG CCAGCGAGCA GGCATTGAGT 3300 

GCCCAAGAGG GATTTGAGAG AATAAAACAA AAGTATGCTG AGTTGAAGGA CCGGTTGGGT 3360 

CAGAGTTCCA TGCTGGGTGA GCAGGGTGCC CGGATCCAGA GTGTGAAGAC AGAGGCAGAG 3420 

GAGCTGTTTG GGGAGACCAT GGAGATGATG GACAGGATGA AAGACATGGA GTTGGAGCTG 3480 

CTGCGGGGCA GCCAGGCCAT CATGCTGCGC TCGGCGGACC TGACAGGACT GGAGAAGCGT 3540 

GTGGAGCAGA TCCGTGACCA CATCAATGGG CGCGTGCTCT ACTATGCCAC CTGCAAGTGA 3600 

TGCTACAGCT TCCAGCCCGT TGCCCCACTC ATCTGCCGCC TTTGCTTTTG GTTGGGGGCA 3660 

GATTGGGTTG GAATGCTTTC CATCTCCAGG AGACTTTCAT GCAGCCTAAA GTACAGCCTG 3720 

GACCACCCCT GGTGTGTAGC TAGTAAGATT ACCCTGAGCT GCAGCTGAGC CTGAGCCAAT 3780 

GGGACAGTTA CACTTGACAG ACAAAGATGG TGGAGATTGG CATGCCATTG AAACTAAGAG 3840 

CTCTCAAGTC AAGGAAGCTG GGCTGGGCAG TATCCCCCGC CTTTAGTTCT CCACTGGGGA 3900 

GGAATCCTGG ACCAAGCACA AAAACTTAAC AAAAGTGATG TAAAAATGAA AAGCCAAATA 3960 
AAAATCTTTG G 

Seq ID NO: 521 Protein sequence 
Protein Accession #: NP_000219.1 

1 11 21 31 41 51 

I I I I I I 

MRPFFLLCFA LPGLLHAQQA CSRGACYPPV GDLLVGRTRF LRASSTCGLT KPETYCTQYG 60 

EWQMKCCKCD SRQPHNYYSH RVENVASSSG PMRWWQSQND VNPVSLQLDL DRRFQLQEVM 120 

MEFQGPMPAG MLIERSSDFG KTWRVYQYLA ADCTSTFPRV RQGRPQSWQD VRCQSLPQRP 180 

NARLNGGKVQ LNLMDLVSGI PATQSQKIQE VGEITNLRVN FTRLAPVPQR GYHPPSAYYA 240 

VSQLRLQGSC FCHGHADRCA PKPGASAGPS TAVQVHDVCV CQHNTAGPNC ERCAPFYNNR 300 

PWRPAEGQDA HECQRCDCNG HSETCHFDPA VFAASQGAYG GVCDNCRDHT EGKNCERCQL 360 

HYFRNRRPGA SIQETCISCE CDPDGAVPGA PCDPVTGQCV CKEHVQGERC DLCKPGFTGL 420 

TYANPQGCHR CDCNILGSRR DMPCDEESGR CLCLPNVVGP KCDQCAPYHW KLASGQGCEP 480 

CACDPHNSPQ PTVQPVHRAV PCREGFGGLM CSAAAIRQCP DRTYGDVATG CRACDCDFRG 540 

TEGPGCDKAS GRCLCRPGLT GPRCDQCQRG YCNRYPVCVA CHPCFQTYDA DLREQALRFG 600 

RLRNATASLW SGPGLEDRGL ASRILDAKSK IEQIRAVLSS PAVTEQEVAQ VASAILSLRR 660 

TLQGLQLDLP LEEETLSLPR DLESLDRSFN GLLTMYQRKR EQFEKISSAD PSGAFRMLST 720 

AYEQSAQAAQ QVSDSSRLLD QLRDSRREAE RLVRQAGGGG GTGSPKLVAL RLEMSSLPDL 780 

TPTFNKLCGN SRQMACTPIS CPGELCPQDN GTACGSRCRG VLPRAGGAFL MAGQVAEQLR 840 

GFNAQLQRTR QMIRAAEESA SQIQSSAQRL ETQVSASRSQ MEEDVRRTRL LIQQVRDFLT 900 

DPDTDAATIQ EVSEAVLALW LPTDSATVLQ KMNEIQAIAA RLPNVDLVLS QTKQDIARAR 960 

RLQAEAEEAR SRAHAVEGQV EDVVGNLRQG TVALQEAQDT MQGTSRSLRL IQDRVAEVQQ 1020 

VLRPAEKLVT SMTKQLGDFW TRMEELRHQA RQQGAEAVQA QQLAEGASEQ ALSAQEGFER 1080 

IKQKYAELKD RLGQSSMLGE QGARIQSVKT EAEELFGETM EMMDRMKDME LELLRGSQAI 1140 
MLRSADLTGL EKRVEQIRDH INGRVLYYAT CK 

Seq ID NO: 522 DNA sequence 
Nucleic Acid Accession #: NM_001944.1 
Coding sequence: 84.. 3083 

1 11 21 31 41 51 

I I I I I I 

TTTTCTTAGA CATTAACTGC AGACGGCTGG CAGGATAGAA GCAGCGGCTC ACTTGGACTT 
TTTCACCAGG GAAATCAGAG ACAATGATGG GGCTCTTCCC CAGAACTACA GGGGCTCTGG 
CCATCTTCGT GGTGGTCATA TTGGTTCATG GAGAATTGCG AATAGAGACT AAAGGTCAAT 
ATGATGAAGA AGAGATGACT ATGCAACAAG CTAAAAGAAG GCAAAAACGT GAATGGGTGA 
AATTTGCCAA ACCCTGCAGA GAAGGAGAAG ATAACTCAAA AAGAAACCCA ATTGCCAAGA 
TTACTTCAGA TTACCAAGCA ACCCAGAAAA TCACCTACCG AATCTCTGGA GTGGGAATCG 
ATCAGCCGCC TTTTGGAATC TTTGTTGTTG ACAAAAACAC TGGAGATATT AACATAACAG 
CTATAGTCGA CCGGGAGGAA ACTCCAAGCT TCCTGATCAC ATGTCGGGCT CTAAATGCCC 
AAGGACTAGA TGTAGAGAAA CCACTTATAC TAACGGTTAA AATTTTGGAT ATTAATGATA 
ATCCTCCAGT ATTTTCACAA CAAATTTTCA TGGGTGAAAT TGAAGAAAAT AGTGCCTCAA 
ACTCACTGGT GATGATACTA AATGCCACAG ATGCAGATGA ACCAAACCAC TTGAATTCTA 
AAATTGCCTT CAAAATTGTC TCTCAGGAAC CAGCAGGCAC ACCCATGTTC CTCCTAAGCA 
GAAACACTGG GGAAGTCCGT ACTTTGACCA ATTCTCTTGA CCGAGAGCAA GCTAGCAGCT 
ATCGTCTGGT TGTGAGTGGT GCAGACAAAG ATGGAGAAGG ACTATCAACT CAATGTGAAT 
GTAATATTAA AGTGAAAGAT GTCAACGATA ACTTCCCAAT GTTTAGAGAC TCTCAGTATT 
CAGCACGTAT TGAAGAAAAT ATTTTAAGTT CTGAATTACT TCGATTTCAA GTAACAGATT 
TGGATGAAGA GTACACAGAT AATTGGCTTG CAGTATATTT CTTTACCTCT GGGAATGAAG 
GAAATTGGTT TGAAATACAA ACTGATCCTA GAACTAATGA AGGCATCCTG AAAGTGGTGA 
AGGCTCTAGA TTATGAACAA CTACAAAGCG TGAAACTTAG TATTGCTGTC AAAAACAAAG 
CTGAATTTCA CCAATCAGTT ATCTCTCGAT ACCGAGTTCA GTCAACCCCA GTCACAATTC 
AGGTAATAAA TGTAAGAGAA GGAATTGCAT TCCGTCCTGC TTCCAAGACA TTTACTGTGC 
AAAAAGGCAT AAGTAGCAAA AAATTGGTGG ATTATATCCT GGGAACATAT CAAGCCATCG 
ATGAGGACAC TAACAAAGCT GCCTCAAATG TCAAATATGT CATGGGACGT AACGATGGTG 
GATACCTAAT GATTGATTCA AAAACTGCTG AAATCAAATT TGTCAAAAAT ATGAACCGAG 
ATTCTACTTT CATAGTTAAC AAAACAATCA CAGCTGAGGT TCTGGCCATA GATGAATACA 
CGGGTAAAAC TTCTACAGGC ACGGTATATG TTAGAGTACC CGATTTCAAT GACAATTGTC 
CAACAGCTGT CCTCGAAAAA GATGCAGTTT GCAGTTCTTC ACCTTCCGTG GTTGTCTCCG 
CTAGAACACT GAATAATAGA TACACTGGCC CCTATACATT TGCACTGGAA GATCAACCTG 
TAAAGTTGCC TGCCGTATGG AGTATCACAA CCCTCAATGC TACCTCGGCC CTCCTCAGAG 
CCCAGGAACA GATACCTCCT GGAGTATACC ACATCTCCCT GGTACTTACA GACAGTCAGA 
ACAATCGGTG TGAGATGCCA CGCAGCTTGA CACTGGAAGT CTGTCAGTGT GACAACAGGG 
GCATCTGTGG AACTTCTTAC CCAACCACAA GCCCTGGGAC CAGGTATGGC AGGCCGCACT 1920 

CAGGGAGGCT GGGGCCTGCC GCCATCGGCC TGCTGCTCCT TGGTCTCCTC CTGCTGCTGT 1980 

TGGCCCCCCT TCTGCTGTTG ACCTGTGACT GTGGGGCAGG TTCTACTGGG GGAGTGACAG 2040 

GTGGTTTTAT CCCAGTTCCT GATGGCTCAG AAGGAACAAT TCATCAGTGG GGAATTGAAG 2100 

GAGCCCATCC TGAAGACAAG GAAATCACAA ATATTTGTGT GCCTCCTGTA ACAGCCAATG 2160 

GAGCCGATTT CATGGAAAGT TCTGAAGTTT GTACAAATAC GTATGCCAGA GGCACAGCGG 2220 

TGGAAGGCAC TTCAGGAATG GAAATGACCA CTAAGCTTGG AGCAGCCACT GAATCTGGAG 2280 

GTGCTGCAGG CTTTGCAACA GGGACAGTGT CAGGAGCTGC TTCAGGATTC GGAGCAGCCA 2340 

CTGGAGTTGG CATCTGTTCC TCAGGGCAGT CTGGAACCAT GAGAACAAGG CATTCCACTG 2400 

GAGGAACCAA TAAGGACTAC GCTGATGGGG CGATAAGCAT GAATTTTCTG GACTCCTACT 2460 

TTTCTCAGAA AGCATTTGCC TGTGCGGAGG AAGACGATGG CCAGGAAGCA AATGACTGCT 2520 

TGTTGATCTA TGATAATGAA GGCGCAGATG CCACTGGTTC TCCTGTGGGC TCCGTGGGTT 2580 

GTTGCAGTTT TATTGCTGAT GACCTGGATG ACAGCTTCTT GGACTCACTT GGACCCAAAT 2640 

TTAAAAAACT TGCAGAGATA AGCCTTGGTG TTGATGGTGA AGGCAAAGAA GTTCAGCCAC 2700 

CCTCTAAAGA CAGCGGTTAT GGGATTGAAT CCTGTGGCCA TCCCATAGAA GTCCAGCAGA 2760 

CAGGATTTGT TAAGTGCCAG ACTTTGTCAG GAAGTCAAGG AGCTTCTGCT TTGTCCGCCT 2820 

CTGGGTCTGT CCAGCCAGCT GTTTCCATCC CTGACCCTCT GCAGCATGGT AACTATTTAG 2880 

TAACGGAGAC TTACTCGGCT TCTGGTTCCC TCGTGCAACC TTCCACTGCA GGCTTTGATC 2940 

CACTTCTCAC ACAAAATGTG ATAGTGACAG AAAGGGTGAT CTGTCCCATT TCCAGTGTTC 3000 

CTGGCAACCT AGCTGGCCCA ACGCAGCTAC GAGGGTCACA TACTATGCTC TGTACAGAGG 3060 

ATCCTTGCTC CCGTCTAATA TGACCAGAAT GAGCTGGAAT ACCACACTGA CCAAATCTGG 3120 

ATCTTTGGAC TAAAGTATTC AAAATAGCAT AGCAAAGCTC ACTGTATTGG GCTAATAATT 3180 

TGGCACTTAT TAGCTTCTCT CATAAACTGA TCACGATTAT AAATTAAATG TTTGGGTTCA 3240 

TACCCCAAAA GCAATATGTT GTCACTCCTA ATTCTCAAGT ACTATTCAAA TTGTAGTAAA 3300 
TCTTAAAGTT TTTCAAAACC CTAAAATCAT ATTCGC 



Seq ID NO: 523 Protein sequence 
Protein Accession #: NP_001935.1 

1 11 21 31 41 51 

MMGLFPRTTG ALAIFVVVIL VHGELRIETK GQYDEEEMTM QQAKRRQKRE WVKFAKPCRE 60 

GEDNSKRNPI AKITSDYQAT QKITYRISGV GIDQPPFGIF VVDKNTGDIN ITAIVDREET 120 

PSFLITCRAL NAQGLDVEKP LILTVKILDI NDNPPVFSQQ IFMGEIEENS ASNSLVMILN 180 

ATDADEPNHL NSKIAFKIVS QEPAGTPMFL LSRNTGEVRT LTNSLDREQA SSYRLVVSGA 240 

DKDGEGLSTQ CECNIKVKDV NDNFPMFRDS QYSARIEENI LSSELLRFQV TDLDEEYTDN 300 

WLAVYFFTSG NEGNWFEIQT DPRTNEGILK VVKALDYEQL QSVKLSIAVK NKAEFHQSVI 360 

SRYRVQSTPV TIQVINVREG IAFRPASKTF TVQKGISSKK LVDYILGTYQ AIDEDTNKAA 420 

SNVKYVMGRN DGGYLMIDSK TAEIKFVKNM NRDSTFIVNK TITAEVLAID EYTGKTSTGT 480 

VYVRVPDFND NCPTAVLEKD AVCSSSPSVV VSARTLNNRY TGPYTFALED QPVKLPAVWS 540 

ITTLNATSAL LRAQEQIPPG VYHISLVLTD SQNNRCEMPR SLTLEVCQCD NRGICGTSYP 600 

TTSPGTRYGR PHSGRLGPAA IGLLLLGLLL LLLAPLLLLT CDCGAGSTGG VTGGFIPVPD 660 

GSEGTIHQWG IEGAHPEDKE ITNICVPPVT ANGADFMESS EVCTNTYARG TAVEGTSGME 720 

MTTKLGAATE SGGAAGFATG TVSGAASGFG AATGVGICSS GQSGTMRTRH STGGTNKDYA 780 

DGAISMNFLD SYFSQKAFAC AEEDDGQEAN DCLLIYDNEG ADATGSPVGS VGCCSFIADD 840 

LDDSFLDSLG PKFKKLAEIS LGVDGEGKEV QPPSKDSGYG IESCGHPIEV QQTGFVKCQT 900 

LSGSQGASAL SASGSVQPAV SIPDPLQHGN YLVTETYSAS GSLVQPSTAG FDPLLTQNVI 960 
VTERVICPIS SVPGNLAGPT QLRGSHTMLC TEDPCSRLI 



Seq ID NO: 524 DNA sequence 

Nucleic Acid Accession #: XM_058069.2 

Coding sequence: 1.. 1413 

1 11 21 31 41 51 

I I I I I I 

ATGAAGTTTC TTCTAATACT GCTCCTGCAG GCCACTGCTT CTGGAGCTCT TCCCCTGAAC 60 

AGCTCTACAA GCCTGGAAAA AAATAATGTG CTATTTGGTG AAAGATACTT AGAAAAATTT 120 

TATGGCCTTG AGATAAACAA ACTTCCAGTG ACAAAAATGA AATATAGTGG AAACTTAATG 180 

AAGGAAAAAA TCCAAGAAAT GCAGCACTTC TTGGGTCTGA AAGTGACCGG GCAACTGGAC 240 

ACATCTACCC TGGAGATGAT GCACGCACCT CGATGTGGAG TCCCCGATGT CCATCATTTC 300 

AGGGAAATGC CAGGGGGGCC CGTATGGAGG AAACATTATA TCACCTACAG AATCAATAAT 360 

TACACACCTG ACATGAACCG TGAGGATGTT GACTACGCAA TCCGGAAAGC TTTCCAAGTA 420 

TGGAGTAATG TTACCCCCTT GAAATTCAGC AAGATTAACA CAGGCATGGC TGACATTTTG 480 

GTGGTTTTTG CCCGTGGAGC TCATGGAGAC TTCCATGCTT TTGATGGCAA AGGTGGAATC 540 

CTAGCCCATG CTTTTGGACC TGGATCTGGC ATTGGAGGGG ATGCACATTT CGATGAGGAC 600 

GAATTCTGGA CTACACATTC AGGAGGCACA AACTTGTTCC TCACTGCTGT TCACGAGATT 660 

GGCCATTCCT TAGGTCTTGG CCATTCTAGT GATCCAAAGG CCGTAATGTT CCCCACCTAC 720 

AAATATGTTG ACATCAACAC ATTTCGCCTC TCTGCTGATG ACATACGTGG CATTCAGTCC 780 

CTGTATGGAG ACCCAAAAGA GAACCAACGC TTGCCAAATC CTGACAATTC AGAACCAGCT 840 

CTCTGTGACC CCAATTTGAG TTTTGATGCT GTCACTACCG TGGGAAATAA GATCTTTTTC 900 

TTCAAAGACA GGTTCTTCTG GCTGAAGGTT TCTGAGAGAC CAAAGACCAG TGTTAATTTA 960 

ATTTCTTCCT TATGGCCAAC CTTGCCATCT GGCATTGAAG CTGCTTATGA AATTGAAGCC 1020 

AGAAATCAAG TTTTTCTTTT TAAAGATGAC AAATACTGGT TAATTAGCAA TTTAAGACCA 1080 

GAGCCAAATT ATCCCAAGAG CATACATTCT TTTGGTTTTC CTAACTTTGT GAAAAAAATT 1140 

GATGCAGCTG TTTTTAACCC ACGTTTTTAT AGGACCTACT TCTTTGTAGA TAACCAGTAT 1200 

TGGAGGTATG ATGAAAGGAG ACAGATGATG GACCCTGGTT ATCCCAAACT GATTACCAAG 1260 

AACTTCCAAG GAATCGGGCC TAAAATTGAT GCAGTCTTCT ACTCTAAAAA CAAATACTAC 1320 

TATTTCTTCC AAGGATCTAA CCAATTTGAA TATGACTTCC TACTCCAACG TATCACCAAA 1380 
ACACTGAAAA GCAATAGCTG GTTTGGTTGT TGA 



Seq ID NO: 525 Protein sequence 
Protein Accession #: P39900 



1 11 21 31 41 51 

I I I I I I 

MKFLLILLLQ ATASGALPLN SSTSLEKNNV LFGERYLEKF YGLEINKLPV TKMKYSGNLM 60 
KEKIQEMQHF LGLKVTGQLD TSTLEMMHAP RCGVPDVHHF REMPGGPVWR KHYITYRINN 120 
YTPDMNREDV DYAIRKAFQV WSNVTPLKFS KINTGMADIL VVFARGAHGD FHAFDGKGGI 180 

LAHAFGPGSG IGGDAHFDED EFWTTHSGGT NLFLTAVHEI GHSLGLGHSS DPKAVMFPTY 240 

KYVDINTFRL SADDIRGIQS LYGDPKENQR LPNPDNSEPA LCDPNLSFDA VTTVGNKIFF 300 

FKDRFFWLKV SERPKTSVNL ISSLWPTLPS GIEAAYEIEA RNQVFLFKDD KYWLISNLRP 360 

EPNYPKSIHS FGFPNFVKKI DAAVFNPRFY RTYFFVDNQY WRYDERRQMM DPGYPKLITK 420 
NFQGIGPKID AVFYSKNKYY YFFQGSNQFE YDFLLQRITK TLKSNSWFGC 

Seq ID NO: 526 DNA sequence 

Nucleic Acid Accession #: NM_024423.1 

Coding sequence: 64.. 2590 

1 11 21 31 41 51 

I I I I I I 

GGCAGGTCTC GCTCTCGGCA CCCTCCCGGC GCCCGCGTTC TCCTGGCCCT GCCCGGCATC 60 

CCGATGGCCG CCGCTGGGCC CCGGCGCTCC GTGCGCGGAG CCGTCTGCCT GCATCTGCTG 120 

CTGACCCTCG TGATCTTCAG TCGTGATGGT GAAGCCTGCA AAAAGGTGAT ACTTAATGTA 180 

CCTTCTAAAC TAGAGGCAGA CAAAATAATT GGCAGAGTTA ATTTGGAAGA GTGCTTCAGG 240 

TCTGCAGACC TCATCCGGTC AAGTGATCCT GATTTCAGAG TTCTAAATGA TGGGTCAGTG 300 

TACACAGCCA GGGCTGTTGC GCTGTCTGAT AAGAAAAGAT CATTTACCAT ATGGCTTTCT 360 

GACAAAAGGA AACAGACACA GAAAGAGGTT ACTGTGCTGC TAGAACATCA GAAGAAGGTA 420 

TCGAAGACAA GACACACTAG AGAAACTGTT CTCAGGCGTG CCAAGAGGAG ATGGGCACCT 480 

ATTCCTTGCT CTATGCAAGA GAATTCCTTG GGCCCTTTCC CATTGTTTCT TCAACAAGTT 540 

GAATCTGATG CAGCACAGAA CTATACTGTC TTCTACTCAA TAAGTGGACG TGGAGTTGAT 600 

AAAGAACCTT TAAATTTGTT TTATATAGAA AGAGACACTG GAAATCTATT TTGCACTCGG 660 

CCTGTGGATC GTGAAGAATA TGATGTTTTT GATTTGATTG CTTATGCGTC AACTGCAGAT 720 

GGATATTCAG CAGATCTGCC CCTCCCACTA CCCATCAGGG TAGAGGATGA AAATGACAAC 780 

CACCCTGTTT TCACAGAAGC AATTTATAAT TTTGAAGTTT TGGAAAGTAG TAGACCTGGT 840 

ACTACAGTGG GGGTGGTTTG TGCCACAGAC AGAGATGAAC CGGACACAAT GCATAOGCGC 900 

CTGAAATACA GCATTTTGCA GCAGACACCA AGGTCACCTG GGCTCTTTTC TGTGCATCCC 960 

AGCACAGGCG TAATCACCAC AGTCTCTCAT TATTTGGACA GAGAGGTTGT AGACAAGTAC 1020 

TCATTGATAA TGAAAGTACA AGACATGGAT GGCCAGTTTT TTGGATTGAT AGGCACATCA 1080 

ACTTGTATCA TAACAGTAAC AGATTCAAAT GATAATGCAC CCACTTTCAG ACAAAATGCT 1140 

TATGAAGCAT TTGTAGAGGA AAATGCATTC AATGTGGAAA TCTTACGAAT ACCTATAGAA 1200 

GATAAGGATT TAATTAACAC TGCCAATTGG AGAGTCAATT TTACCATTTT AAAGGGAAAT 1260 

GAAAATGGAC ATTTCAAAAT CAGCACAGAC AAAGAAACTA ATGAAGGTGT TCTTTCTGTT 1320 

GTAAAGCCAC TGAATTATGA AGAAAACCGT CAAGTGAACC TGGAAATTGG AGTAAACAAT 1380 

GAAGOGCCAT TTGCTAGAGA TATTCCCAGA GTGACAGCCT TGAACAGAGC CTTGGTTACA 1440 

GTTCATGTGA GGGATCTGGA TGAGGGGCCT GAATGCACTC CTGCAGCCCA ATATGTGCGG 1500 

ATTAAAGAAA ACTTAGCAGT GGGGTCAAAG ATCAACGGCT ATAAGGCATA TGACCCCGAA 1560 

AATAGAAATG GCAATGGTTT AAGGTACAAA AAATTGCATG ATCCTAAAGG TTGGATCACC 1620 

ATTGATGAAA TTTCAGGGTC AATCATAACT TCCAAAATCC TGGATAGGGA GGTTGAAACT 1680 

CCCAAAAATG AGTTGTATAA TATTACAGTC CTGGCAATAG ACAAAGATGA TAGATCATGT 1740 

ACTGGAACAC TTGCTGTGAA CATTGAAGAT GTAAATGATA ATCCACCAGA AATACTTCAA 1800 

GAATATGTAG TCATTTGCAA ACCAAAAATG GGGTATACCG ACATTTTAGC TGTTGATCCT 1860 

GATGAACCTG TCCATGGAGC TCCATTTTAT TTCAGTTTGC CCAATACTTC TCCAGAAATC 1920 

AGTAGACTGT GGAGCCTCAC CAAAGTTAAT GATACAGCTG CCCGTCTTTC ATATCAGAAA 1980 

AATGCTGGAT TTCAAGAATA TACCATTCCT ATTACTGTAA AAGACAGGGC CGGCCAAGCT 2040 

GCAACAAAAT TATTGAGAGT TAATCTGTGT GAATGTACTC ATCCAACTCA GTGTCGTGCG 2100 

ACTTCAAGGA GTACAGGAGT AATACTTGGA AAATGGGCAA TCCTTGCAAT ATTACTGGGT 2160 

ATAGCACTGC TCTTTTCTGT ATTGCTAACT TTAGTATGTG GAGTTTTTGG TGCAACTAAA 2220 

GGGAAACGTT TTCCTGAAGA TTTAGCACAG CAAAACTTAA TTATATCAAA CACAGAAGCA 2280 

CCTGGAGACG ATAGAGTGTG CTCTGCCAAT GGATTTATGA CCCAAACTAC CAACAACTCT 2340 

AGCCAAGGTT TTTGTGGTAC TATGGGATCA GGAATGAAAA ATGGAGGGCA GGAAACCATT 2400 

GAAATGATGA AAGGAGGAAA CCAGACCTTG GAATCCTGCC GGGGGGCTGG GCATCATCAT 2460 

ACCCTGGACT CCTGCAGGGG AGGACACACG GAGGTGGACA ACTGCAGATA CACTTACTCG 2520 

GAGTGGCACA GTTTTACTCA ACCCCGTCTC GGTGAAGAAT CCATTAGAGG ACACACTGGT 2580 

TAAAAATTAA ACATAAAAGA AATTGCATCG ATGTAATCAG AATGAAGACC GCATGCCATC 2640 

CCAAGATTAT GTCCTCACTT ATAACTATGA GGGAAGAGGA TCTCCAGCTG GTTCTGTGGG 2700 

CTGCTGCAGT GAAAAGCAGG AAGAAGATGG CCTTGACTTT TTAAATAATT TGGAACCCAA 2760 

ATTTATTACA TTAGCAGAAG CATGCACAAA GAGATAATGT CACAGTGCTA CAATTAGGTC 2820 

TTTGTCAGAC ATTCTGGAGG TTTCCAAAAA TAATATTGTA AAGTTCAATT TCAACATGTA 2880 

TGTATATGAT GATTTTTTTC TCAATTTTGA ATTATGCTAC TCACCAATTT ATATTTTTAA 2940 

AGCCAGTTGT TGCTTATCTT TTCCAAAAAG TGAAAAATGT TAAAACAGAC AACTGGTAAA 3000 

TCTCAAACTC CAGCACTGGA ATTAAGGTCT CTAAAGCATC TGCTCTTTTT TTTTTTTACG 3060 

GATATTTTAG TAATAAATAT GCTGGATAAA TATTAGTCCA ACAATAGCTA AGTTATGCTA 3120 

ATATCACATT ATTATGTATT CACTTTAAGT GATAGTTTAA AAAATAAACA AGAAATATTG 3180 

AGTATCACTA TGTGAAGAAA GTTTTGGAAA AGAAACAATG AAGACTGAAT TAAATTAAAA 3240 

ATGTTGCAGC TCATAAAGAA TTGGGACTCA CCCCTACTGC ACTACCAAAT TCATTTGACT 3300 

TTGGAGGCAA AATGTGTTGA AGTGCCCTAT GAAGTAGCAA TTTTCTATAG GAATATAGTT 3360 

GGAAATAAAT GTGTGTGTGT ATATTATTAT TAATCAATGC AATATTTAAA ATGAAATGAG 3420 

AACAAAGAGG AAAATGGTAA AAACTTGAAA TGAGGCTGGG GTATAGTTTG TCCTACAATA 3480 

GAAAAAAGAG AGAGCTTCCT AGGCCTGGGC TCTTAAATGC TGCATTATAA CTGAGTCTAT 3540 

GAGGAAATAG TTCCTGTCCA ATTTGTGTAA TTTGTTTAAA ATTGTAAATA AATTAAACTT 3600 

TTCTGGTTTC TGTGGGAAGG AAATAGGGAA TCCAATGGAA CAGTAGCTTT GCTTTGCAGT 3660 

CTGTTTCAAG ATTTCTGCAT CCACAAGTTA GTAGCAAACT GGGGAATACT CGCTGCAGCT 3720 

GGGGTTCCCT GCTTTTTGGT AGCAAGGGTC CAGAGATGAG GTGTTTTTTT CGGGGAGCTA 3780 

ATAACAAAAA CATTTTAAAA CTTACCTTTA CTGAAGTTAA ATCCTCTATT GCTGTTTCTA 3840 

TTCTCTCTTA TAGTGACCAA CATCTTTTTA ATTTAGATCC AAATAACCAT GTCCTCCTAG 3900 

AGTTTAGAGG CTAGAGGGAG CTGAGGGGAG GATCTTACTG AAAGCACCCT GGGGAGATTG 3960 

ATTGTCCTTA AACCTAAGCC CCACAAACTT GACACCTGAT CAGGTCTGGG AGCTACAAAA 4020 

TTTCATTTTT CTCCTCACTG CCCTTCTTCT GAGTGGCATT GGCCTGAATC AAGGAAAGCC 4080 

AGGCCTTGTG GGCCCCCTTC TTTCGGCTTT CTGCTAAAGC AACACCTCCA GCAGAGATTC 4140 

CCTTAAGTGA CTCCAGGTTT TCCACCATCC TTCAGCGTGA ATTAATTTTT AATCAGTTTG 4200 

CTTTCTCCAG AGAAATTTTA AAATAATAGA AGAAATAGAA ATTTTGAATG TATAAAAGAA 4260 

AAAGATCAAG TTGTCATTTT AGAACAGAGG GAACTTTGGG AGAAAGCAGC CCAAGTAGGT 4320 

TATTTGTACA GTCAGAGGGC AACAGGAAGA TGCAGGCCTT CAAGGGCAAG GAGAGGCCAC 4380 

AAGGAATATG GGTGGGAGTA AAAGCAAGAT CGTCTGCTTC ATACTTTTTC CTAGGCTTGG 4440 




CACTGCCTTT TCCTTTCTCA GGCCAATGGC AACTGCCATT TGAGTCCGGT GAGGGATCAG 4500 

CCAACCTCTT CTCTATGGCT CACCTTATTT GGAGTGAGAA ATCAAGGAGA CAGAGCTGAC 4560 

TGCATGATGA GTCTGAAGGC ATTTGCAGGA TGAGCCTGAA CTGGTTGTGC AGAACAAACA 4620 

AGGCATTCAT GGGAATTGTT GTATTCCTTC TGCAGCCCTC CTTCTGGGCA CTAAGAAGGT 4680 

CTATGAATTA AATGCCTATC TAAAATTCTG ATTTATTCCT ACATTTTCTG TTTTCTAATT 4740 

TGACCCTAAA ATCTATGTGT TTTAGACTTA GACTTTTTAT TGCCCCCCCC CCCTTTTTTT 4800 

TTGAGACGGA GTCTCGCTCT GACGCACAGG CTGGAGTGCA GTGGCTCCGA TCTCTGCTCA 4860 

CTGAAAGCTC CGCCTCCCGG GTTCATGCCA TTCTCCTGCC TCAGCCTCCT GAGTAGCTGG 4920 

GACTAGAGGC GCCCACCACC ACGCCCGGCT AATTTTTTGT ATTTTTAATA GAGACGGGGT 4980 

TTCACTGTGT TAGCCAGGAT GGTCTCGATC TCCTGACCTC GTGATCCGCC TGCCTCGGCC 5040 

TCCCAAAGTG CTGGGATTAC AGGCATGACC CACCGCTCCC GGCCTTGTTT TCCGTTTAAA 5100 

GTCGTCTTCT TTTAATGTAA TCATTTTGAA CATGTGTGAA AGTTGATCAT ACGAATTGGA 5160 

TCAATCTTGA AATACTCAAC CAAAAGACAG TCGAGAAGCC AGGGGGAGAA AGAACTCAGG 5220 

GCACAAAATA TTGGTCTGAG AATGGAATTC TCTGTAAGCC TAGTTGCTGA AATTTCCTGC 5280 

TGTAACCAGA AGCCAGTTTT ATCTAACGGC TACTGAAACA CCCACTGTGT TTTGCTCACT 5340 

CCCACTCACC GATCAAAACC TGCTACCTCC CCAAGACTTT ACTAGTGCCG ATAAACTTTC 5400 

TCAAAGAGCA ACCAGTATCA CTTCCCTGTT TATAAAACCT CTAACCATCT CTTTGTTCTT 5460 

TGAACATGCT GAAAACCACC TGGTCTGCAT GTATGCCCGA ATTTGTAATT CTTTTCTCTC 5520 

AAATGAAAAT TTAATTTTAG GGATTCATTT CTATATTTTC ACATATGTAG TATTATTATT 5580 

TCCTTATATG TGTAAGGTGA AATTTATGGT ATTTGAGTGT GCAAGAAAAT ATATTTTTAA 5640 

AGCTTTCATT TTTCCCCCAG TGAATGATTT AGAATTTTTT ATGTAAATAT ACAGAATGTT 5700 

TTTTCTTACT TTTATAAGGA AGCAGCTGTC TAAAATGCAG TGGGGTTTGT TTTGCAATGT 5760 

TTTAAACAGA GTTTTAGTAT TGCTATTAAA AGAAGTTACT TTGCTTTTAA AGAAACTTGG 5820 

CTGCTTAAAA TAAGCAAAAA TTGGATGCAT AAAGTAATAT TTACAGATGT GGGGAGATGT 5880 

AATAAAACAA TATTAACTTG GCTGCTTAAA ATAAGCAAAA ATTGGATGCA TAAAGTAATA 5940 

TTTACAGATG TGGGGAGATG TAATAAAACA ATATTAACTT GGTTTCTTGT TTTTGCTGTA 6000 

TTTAGAGATT AAATAATTCT AAGATGATCA CTTTGCAAAA TTATGCTTAT GGCTGGCATG 6060 

GAAATAGAAA TACTCAATTA TGTCTTTGTT GTATTAATGG GGAATATTTT GGACAATGTT 6120 

TCATTATCAA ATTGTCGACA TCATTAATAT ATATTGTAAT GTTGGGAAGA GATCACTATT 6180 

TTGAAGCACA GCTTTACAGA TGAGTATCTA TGATACATAT GTATAATAAA TTTTGATCGG 6240 

GTATTAAAAG TATTAGAAGG TGGTTATAAT TGCAGAGTAT TCCATGAATA GTACACTGAC 6300 

ACAGGGGTTT TACTTTGAGG ACCAGTGTAG TCAAGGGAAA ACATGAGTTA AAAAGAAAAG 6360 

CAGGCAATAT TGCAGTCTTG ATTCTGCCAC TTACAGGATA GATAATGCCT GAACTTTAAT 6420 

GACAAGATGA TCCAACCATA AAGGTGCTCT GTGCTTCACA GTGAATCTTT TCCCCATGCA 6480 

GGAGTGTGCT CCCCTACAAA CGTTAAGACT GATCATTTCA AAAATCTATT AGCTATATCA 6540 

AAAGCCTTAC ATTTTAATAT AGGTTGAACC AAAATTTCAA TTCCAGTAAC TTCTATTGTA 6600 

ACCATTATTT TTGTGTATGT CTTCAAGAAT GTTCATTGGA TTTTTGTTTG TAATAGTAAA 6660 

ATACCGGATA CATTTCACGT GTCCTTCAGT ATTGATTTGG TTGAATATTG GGTCATAATG 6720 

GTTGAGAAGC ATGGACACTA GAGCCAGAAT GCTTGGATAT GAATCCTGGA TCTGTCACTT 6780 

ACTTCTGTGT GACCTTTGAA AGGCTACTTA TTTCCTCTCT TAGCTTTCTC ATTAAAATCA 6840 

ATGAACAATG CCAGCCTCAT GGGGTTGTTG AATGATTAAA TTAGTTAATA TACCTAAAGT 6900 

ACATAGAACA CTGCCTGCAC ATAGTAAAAG AATTATAAGT GTGAGGTAGT TGGTAAAATT 6960 

ATGTAGTTGG ATATACTACC GAACAATATC TAATCTCTTT TTAGGGAAAT AAAGTTTGTG 7020 
CATATATATA ATCCCGAAAC ATG 

Seq ID NO: 527 Protein sequence 
Protein Accession #: NP_077741.1 

1 11 21 31 41 51 

I I I I I I 

MAAAGPRRSV RGAVCLHLLL TLVIFSRDGE ACKKVILNVP SKLEADKIIG RVNLEECFRS 60 
ADLIRSSDPD FRVLNDGSVY TARAVALSDK KRSFTIWLSD KRKQTQKEVT VLLEHQKKVS 120 
KTRHTRETVL RRAKRRWAPI PCSMQENSLG PFPLFLQQVE SDAAQNYTVF YSISGRGVDK 180 
EPLNLFYIER DTGNLFCTRP VDREEYDVFD LIAYASTADG YSADLPLPLP IRVEDENDNH 240 
PVFTEAIYNF EVLESSRPGT TVGVVCATDR DEPDTMHTRL KYSILQQTPR SPGLFSVHPS 300 
TGVITTVSHY LDREVVDKYS LIMKVQDMDG QFFGLIGTST CIITVTDSND NAPTFRQNAY 360 
EAFVEENAFN VEILRIPIED KDLINTANWR VNFTILKGNE NGHFKISTDK ETNEGVLSVV 420 
KPLNYEENRQ VNLEIGVNNE APFARDIPRV TALNRALVTV HVRDLDEGPE CTPAAQYVRI 480 
KENLAVGSKI NGYKAYDPEN RNGNGLRYKK LHDPKGWITI DEISGSIITS KILDREVETP 540 
KNELYNITVL AIDKDDRSCT GTLAVNIEDV NDNPPEILQE YVVICKPKMG YTDILAVDPD 600 
EPVHGAPFYF SLPNTSPEIS RLWSLTKVND TAARLSYQKN AGFQEYTIPI TVKDRAGQAA 660 
TKLLRVNLCE CTHPTQCRAT SRSTGVILGK WAILAILLGI ALLFSVLLTL VCGVFGATKG 720 
KRFPEDLAQQ NLIISNTEAP GDDRVCSANG FMTQTTNNSS QGFCGTMGSG MKNGGQETIE 780 
MMKGGNQTLE SCRGAGHHHT LDSCRGGHTE VDNCRYTYSE WHSFTQPRLG EESIRGHTG 

Seq ID NO: 528 DNA sequence 

Nucleic Acid Accession #: NM_001941.2 

Coding sequence: 64..2754 

1 11 21 31 41 51 

I I I I I I 

GGCAGGTCTC GCTCTCGGCA CCCTCCCGGC GCCCGCGTTC TCCTGGCCCT GCCCGGCATC 60 
CCGATGGCCG CCGCTGGGCC CCGGCGCTCC GTGCGCGGAG CCGTCTGCCT GCATCTGCTG 120 
CTGACCCTCG TGATCTTCAG TCGTGATGGT GAAGCCTGCA AAAAGGTGAT ACTTAATGTA 180 
CCTTCTAAAC TAGAGGCAGA CAAAATAATT GGCAGAGTTA ATTTGGAAGA GTGCTTCAGG 240 
TCTGCAGACC TCATCCGGTC AAGTGATCCT GATTTCAGAG TTCTAAATGA TGGGTCAGTG 300 
TACACAGCCA GGGCTGTTGC GCTGTCTGAT AAGAAAAGAT CATTTACCAT ATGGCTTTCT 360 
GACAAAAGGA AACAGACACA GAAAGAGGTT ACTGTGCTGC TAGAACATCA GAAGAAGGTA 420 
TCGAAGACAA GACACACTAG AGAAACTGTT CTCAGGCGTG CCAAGAGGAG ATGGGCACCT 480 
ATTCCTTGCT CTATGCAAGA GAATTCCTTG GGCCCTTTCC CATTGTTTCT TCAACAAGTT 540 
GAATCTGATG CAGCACAGAA CTATACTGTC TTCTACTCAA TAAGTGGACG TGGAGTTGAT 600 
AAAGAACCTT TAAATTTGTT TTATATAGAA AGAGACACTG GAAATCTATT TTGCACTCGG 660 
CCTGTGGATC GTGAAGAATA TGATGTTTTT GATTTGATTG CTTATGCGTC AACTGCAGAT 720 
GGATATTCAG CAGATCTGCC CCTCCCACTA CCCATCAGGG TAGAGGATGA AAATGACAAC 780 
CACCCTGTTT TCACAGAAGC AATTTATAAT TTTGAAGTTT TGGAAAGTAG TAGACCTGGT 840 
ACTACAGTGG GGGTGGTTTG TGCCACAGAC AGAGATGAAC CGGACACAAT GCATACGCGC 900 
CTGAAATACA GCATTTTGCA GCAGACACCA AGGTCACCTG GGCTCTTTTC TGTGCATCCC 960 
AGCACAGGOG TAATCACCAC AGTCTCTCAT TATTTGGACA GAGAGGTTGT AGACAAGTAC 1020 

TCATTGATAA TGAAAGTACA AGACATGGAT GGCCAGTTTT TTGGATTGAT AGGCACATCA 1080 

ACTTGTATCA TAACAGTAAC AGATTCAAAT GATAATGCAC CCACTTTCAG ACAAAATGCT 1140 

TATGAAGCAT TTGTAGAGGA AAATGCATTC AATGTGGAAA TCTTACGAAT ACCTATAGAA 1200 

GATAAGGATT TAATTAACAC TGCCAATTGG AGAGTCAATT TTACCATTTT AAAGGGAAAT 1260 

GAAAATGGAC ATTTCAAAAT CAGCACAGAC AAAGAAACTA ATGAAGGTGT TCTTTCTGTT 1320 

GTAAAGCCAC TGAATTATGA AGAAAACCGT CAAGTGAACC TGGAAATTGG AGTAAACAAT 1380 

GAAGCGCCAT TTGCTAGAGA TATTCCCAGA GTGACAGCCT TGAACAGAGC CTTGGTTACA 1440 

GTTCATGTGA GGGATCTGGA TGAGGGGCCT GAATGCACTC CTGCAGCCCA ATATGTGCGG 1500 

ATTAAAGAAA ACTTAGCAGT GGGGTCAAAG ATCAACGGCT ATAAGGCATA TGACCCCGAA 1560 

AATAGAAATG GCAATGGTTT AAGGTACAAA AAATTGCATG ATCCTAAAGG TTGGATCACC 1620 

ATTGATGAAA TTTCAGGGTC AATCATAACT TCCAAAATCC TGGATAGGGA GGTTGAAACT 1680 

CCCAAAAATG AGTTGTATAA TATTACAGTC CTGGCAATAG ACAAAGATGA TAGATCATGT 1740 

ACTGGAACAC TTGCTGTGAA CATTGAAGAT GTAAATGATA ATCCACCAGA AATACTTCAA 1800 

GAATATGTAG TCATTTGCAA ACCAAAAATG GGGTATACCG ACATTTTAGC TGTTGATCCT 1860 

GATGAACCTG TCCATGGAGC TCCATTTTAT TTCAGTTTGC CCAATACTTC TCCAGAAATC 1920 

AGTAGACTGT GGAGCCTCAC CAAAGTTAAT GATACAGCTG CCCGTCTTTC ATATCAGAAA 1980 

AATGCTGGAT TTCAAGAATA TACCATTCCT ATTACTGTAA AAGACAGGGC CGGCCAAGCT 2040 

GCAACAAAAT TATTGAGAGT TAATCTGTGT GAATGTACTC ATCCAACTCA GTGTCGTGCG 2100 

ACTTCAAGGA GTACAGGAGT AATACTTGGA AAATGGGCAA TCCTTGCAAT ATTACTGGGT 2160 

ATAGCACTGC TCTTTTCTGT ATTGCTAACT TTAGTATGTG GAGTTTTTGG TGCAACTAAA 2220 

GGGAAACGTT TTCCTGAAGA TTTAGCACAG CAAAACTTAA TTATATCAAA CACAGAAGCA 2280 

CCTGGAGACG ATAGAGTGTG CTCTGCCAAT GGATTTATGA CCCAAACTAC CAACAACTCT 2340 

AGCCAAGGTT TTTGTGGTAC TATGGGATCA GGAATGAAAA ATGGAGGGCA GGAAACCATT 2400 

GAAATGATGA AAGGAGGAAA CCAGACCTTG GAATCCTGCC GGGGGGCTGG GCATCATCAT 2460 

ACCCTGGACT CCTGCAGGGG AGGACACACG GAGGTGGACA ACTGCAGATA CACTTACTCG 2520 

GAGTGGCACA GTTTTACTCA ACCCCGTCTC GGTGAAAAAT TGCATCGATG TAATCAGAAT 2580 

GAAGACCGCA TGCCATCCCA AGATTATGTC CTCACTTATA ACTATGAGGG AAGAGGATCT 2640 

CCAGCTGGTT CTGTGGGCTG CTGCAGTGAA AAGCAGGAAG AAGATGGCCT TGACTTTTTA 2700 

AATAATTTGG AACCCAAATT TATTACATTA GCAGAAGCAT GCACAAAGAG ATAATGTCAC 2760 

AGTGCTACAA TTAGGTCTTT GTCAGACATT CTGGAGGTTT CCAAAAATAA TATTGTAAAG 2820 

TTCAATTTCA ACATGTATGT ATATGATGAT TTTTTTCTCA ATTTTGAATT ATGCTACTCA 2880 

CCAATTTATA TTTTTAAAGC CAGTTGTTGC TTATCTTTTC CAAAAAGTGA AAAATGTTAA 2940 

AACAGACAAC TGGTAAATCT CAAACTCCAG CACTGGAATT AAGGTCTCTA AAGCATCTGC 3000 

TCTTTTTTTT TTTTACGGAT ATTTTAGTAA TAAATATGCT GGATAAATAT TAGTCCAACA 3060 

ATAGCTAAGT TATGCTAATA TCACATTATT ATGTATTCAC TTTAAGTGAT AGTTTAAAAA 3120 

ATAAACAAGA AATATTGAGT ATCACTATGT GAAGAAAGTT TTGGAAAAGA AACAATGAAG 3180 

ACTGAATTAA ATTAAAAATG TTGCAGCTCA TAAAGAATTG GGACTCACCC CTACTGCACT 3240 

ACCAAATTCA TTTGACTTTG GAGGCAAAAT GTGTTGAAGT GCCCTATGAA GTAGCAATTT 3300 

TCTATAGGAA TATAGTTGGA AATAAATGTG TGTGTGTATA TTATTATTAA TCAATGCAAT 3360 

ATTTAAAATG AAATGAGAAC AAAGAGGAAA ATGGTAAAAA CTTGAAATGA GGCTGGGGTA 3420 

TAGTTTGTCC TACAATAGAA AAAAGAGAGA GCTTCCTAGG CCTGGGCTCT TAAATGCTGC 3480 

ATTATAACTG AGTCTATGAG GAAATAGTTC CTGTCCAATT TGTGTAATTT GTTTAAAATT 3540 

GTAAATAAAT TAAACTTTTC TGGTTTCTGT GGGAAGGAAA TAGGGAATCC AATGGAACAG 3600 

TAGCTTTGCT TTGCAGTCTG TTTCAAGATT TCTGCATCCA CAAGTTAGTA GCAAACTGGG 3660 

GAATACTCGC TGCAGCTGGG GTTCCCTGCT TTTTGGTAGC AAGGGTCCAG AGATGAGGTG 3720 

TTTTTTTCGG GGAGCTAATA ACAAAAACAT TTTAAAACTT ACCTTTACTG AAGTTAAATC 3780 

CTCTATTGCT GTTTCTATTC TCTCTTATAG TGACCAACAT CTTTTTAATT TAGATCCAAA 3840 

TAACCATGTC CTCCTAGAGT TTAGAGGCTA GAGGGAGCTG AGGGGAGGAT CTTACTGAAA 3900 

GCACCCTGGG GAGATTGATT GTCCTTAAAC CTAAGCCCCA CAAACTTGAC ACCTGATCAG 3960 

GTCTGGGAGC TACAAAATTT CATTTTTCTC CTCACTGCCC TTCTTCTGAG TGGCATTGGC 4020 

CTGAATCAAG GAAAGCCAGG CCTTGTGGGC CCCCTTCTTT CGGCTTTCTG CTAAAGCAAC 4080 

ACCTCCAGCA GAGATTCCCT TAAGTGACTC CAGGTTTTCC ACCATCCTTC AGCGTGAATT 4140 

AATTTTTAAT CAGTTTGCTT TCTCCAGAGA AATTTTAAAA TAATAGAAGA AATAGAAATT 4200 

TTGAATGTAT AAAAGAAAAA GATCAAGTTG TCATTTTAGA ACAGAGGGAA CTTTGGGAGA 4260 

AAGCAGCCCA AGTAGGTTAT TTGTACAGTC AGAGGGCAAC AGGAAGATGC AGGCCTTCAA 4320 

GGGCAAGGAG AGGCCACAAG GAATATGGGT GGGAGTAAAA GCAACATCGT CTGCTTCATA 4380 

CTTTTTCCTA GGCTTGGCAC TGCCTTTTCC TTTCTCAGGC CAATGGCAAC TGCCATTTGA 4440 

GTCCGGTGAG GGATCAGCCA ACCTCTTCTC TATGGCTCAC CTTATTTGGA GTGAGAAATC 4500 

AAGGAGACAG AGCTGACTGC ATGATGAGTC TGAAGGCATT TGCAGGATGA GCCTGAACTG 4560 

GTTGTGCAGA ACAAACAAGG CATTCATGGG AATTGTTGTA TTCCTTCTGC AGCCCTCCTT 4620 

CTGGGCACTA AGAAGGTCTA TGAATTAAAT GCCTATCTAA AATTCTGATT TATTCCTACA 4680 

TTTTCTGTTT TCTAATTTGA CCCTAAAATC TATGTGTTTT AGACTTAGAC TTTTTATTGC 4740 

CCCCCCCCCC TTTTTTTTTG AGACGGAGTC TCGCTCTGAC GCACAGGCTG GAGTGCAGTG 4800 

GCTCCGATCT CTGCTCACTG AAAGCTCCGC CTCCCGGGTT CATGCCATTC TCCTGCCTCA 4860 

GCCTCCTGAG TAGCTGGGAC TACAGGCGCC CACCACCACG CCCGGCTAAT TTTTTGTATT 4920 

TTTAATAGAG ACGGGGTTTC ACTGTGTTAG CCAGGATGGT CTCGATCTCC TGACCTCGTG 4980 

ATCCGCCTGC CTCGGCCTCC CAAAGTGCTG GGATTACAGG CATGACCCAC CGCTCCCGGC 5040 

CTTGTTTTCC GTTTAAAGTC GTCTTCTTTT AATGTAATCA TTTTGAACAT GTGTGAAAGT 5100 

TGATCATACG AATTGGATCA ATCTTGAAAT ACTCAACCAA AAGACAGTCG AGAAGCCAGG 5160 

GGGAGAAAGA ACTCAGGGCA CAAAATATTG GTCTGAGAAT GGAATTCTCT GTAAGCCTAG 5220 

TTGCTGAAAT TTCCTGCTGT AACCAGAAGC CAGTTTTATC TAACGGCTAC TGAAACACCC 5280 

ACTGTGTTTT GCTCACTCCC TCACTCACCG ATCAAAACCT GCTACCTCCC CAAGACTTTA 5340 

CTAGTGCCGA TAAACTTTCT CAAAGAGCAA CCAGTATCAC TTCCCTGTTT ATAAAACCTC 5400 

TAACCATCTC TTTGTTCTTT GAACATGCTG AAAACCACCT GGTCTGCATG TATGCCCGAA 5460 

TTTGTAATTC TTTTCTCTCA AATGAAAATT TAATTTTAGG GATTCATTTC TATATTTTCA 5520 

CATATGTAGT ATTATTATTT CCTTATATGT GTAAGGTGAA ATTTATGGTA TTTGAGTGTG 5580 

CAAGAAAATA TATTTTTAAA GCTTTCATTT TTCCCCCAGT GAATGATTTA GAATTTTTTA 5640 

TGTAAATATA CAGAATGTTT TTTCTTACTT TTATAAGGAA GCAGCTGTCT AAAATGCAGT 5700 

GGGGTTTGTT TTGCAATGTT TTAAACAGAG TTTTAGTATT GCTATTAAAA GAAGTTACTT 5760 

TGCTTTTAAA GAAACTTGGC TGCTTAAAAT AAGCAAAAAT TGGATGCATA AAGTAATATT 5820 

TACAGATGTG GGGAGATGTA ATAAAACAAT ATTAACTTGG TTTCTTGTTT TTGCTGTATT 5880 

TAGAGATTAA ATAATTCTAA GATGATCACT TTGCAAAATT ATGCTTATGG CTGGCATGGA 5940 

AATAGAAATA CTCAATTATG TCTTTGTTGT ATTAATGGGG AATATTTTGG ACAATGTTTC 6000 

ATTATCAAAT TGTCGACATC ATTAATATAT ATTGTAATGT TGGGAAGAGA TCACTATTTT 6060 

GAAGCACAGC TTTACAGATG AGTATCTATG ATACATATGT ATAATAAATT TTGATCGGGT 6120 

ATTAAAAGTA TTAGAAGGTG GTTATAATTG CAGAGTATTC CATGAATAGT ACACTGACAC 6180 
AGGGGTTTTA CTTTGAGGAC CAGTGTAGTC AAGGGAAAAC ATGAGTTAAA AAGAAAAGCA 6240 
GGCAATATTG CAGTCTTGAT TCTGCCACTT ACAGGATAGA TAATGCCTGA ACTTTAATGA 6300 
CAAGATGATC CAACCATAAA GGTGCTCTGT GCTTCACAGT GAATCTTTTC CCCATGCAGG 6360 
AGTGTGCTCC CCTACAAACG TTAAGACTGA TCATTTCAAA AATCTATTAG CTATATCAAA 6420 
AGCCTTACAT TTTAATATAG GTTGAACCAA AATTTCAATT CCAGTAACTT CTATTGTAAC 6480 
CATTATTTTT GTGTATGTCT TCAAGAATGT TCATTGGATT TTTGTTTGTA ATAGTAAAAT 6540 
ACCGGATACA TTTCACGTGT CCTTCAGTAT TGATTTGGTT GAATATTGGG TCATAATGGT 6600 
TGAGAAGCAT GGACACTAGA GCCAGAATGC TTGGATATGA ATCCTGGATC TGTCACTTAC 6660 
TTCTGTGTGA CCTTTGAAAG GCTACTTATT TCCTCTCTTA GCTTTCTCAT TAAAATCAAT 6720 
GAACAATGCC AGCCTCATGG GGTTGTTGAA TGATTAAATT AGTTAATATA CCTAAAGTAC 6780 
ATAGAACACT GCCTGCACAT AGTAAAAGAA TTATAAGTGT GAGGTAGTTG GTAAAATTAT 6840 
GTAGTTGGAT ATACTACCGA ACAATATCTA ATCTCTTTTT AGGGAAATAA AGTTTGTGCA 6900 
TATATATAAT CCCGAAACAT G 

Seq ID NO: 529 Protein sequence 
Protein Accession #: NP_001932.1 

1 11 21 31 41 51 

I I I I I I 

MAAAGPRRSV RGAVCLHLLL TLVIFSRDGE ACKKVILNVP SKLEADKIIG RVNLEECFRS 60 
ADLIRSSDPD FRVLNDGSVY TARAVALSDK KRSFTIWLSD KRKQTQKEVT VLLEHQKKVS 120 
KTRHTRETVL RRAKRRWAPI PCSMQENSLG PFPLFLQQVE SDAAQNYTVF YSISGRGVDK 180 
EPLNLFYIER DTGNLFCTRP VDREEYDVFD LIAYASTADG YSADLPLPLP IRVEDENDNH 240 
PVFTEAIYNF EVLESSRPGT TVGVVCATDR DEPDTMHTRL KYSILQQTPR SPGLFSVHPS 300 
TGVITTVSHY LDREVVDKYS LIMKVQDMDG QFFGLIGTST CIITVTDSND NAPTFRQNAY 360 
EAFVEENAFN VEILRIPIED KDLINTANWR VNFTILKGNE NGHFKISTDK ETNEGVLSVV 420 
KPLNYEENRQ VNLEIGVNNE APFARDIPRV TALNRALVTV HVRDLDEGPE CTPAAQYVRI 480 
KENLAVGSKI NGYKAYDPEN RNGNGLRYKK LHDPKGWITI DEISGSIITS KILDREVETP 540 
KNELYNITVL AIDKDDRSCT GTLAVNIEDV NDNPPEILQE YVVICKPKMG YTDILAVDPD 600 
EPVHGAPFYF SLPNTSPEIS RLWSLTKVND TAARLSYQKN AGFQEYTIPI TVKDRAGQAA 660 
TKLLRVNLCE CTHPTQCRAT SRSTGVILGK WAILAILLGI ALLFSVLLTL VCGVFGATKG 720 
KRFPEDLAQQ NLIISNTEAP GDDRVCSANG FMTQTTNNSS QGFCGTMGSG MKNGGQETIE 780 
MMKGGNQTLE SCRGAGHHHT LDSCRGGHTE VDNCRYTYSE WHSFTQPRLG EKLHRCNQNE 840 
DRMPSQDYVL TYNYEGRGSP AGSVGCCSEK QEEDGLDFLN NLEPKFITLA EACTKR 

Seq ID NO: 530 DNA sequence 

Nucleic Acid Accession #: NM_016583.2 

Coding sequence: 72..842 

1 11 21 31 41 51 

I I I I I I 

GGAGTGGGGG AGAGAGAGGA GACCAGGACA GCTGCTGAGA CCTCTAAGAA GTCCAGATAC 60 
TAAGAGCAAA GATGTTTCAA ACTGGGGGCC TCATTGTCTT CTACGGGCTG TTAGCCCAGA 120 
CCATGGCCCA GTTTGGAGGC CTGCCCGTGC CCCTGGACCA GACCCTGCCC TTGAATGTGA 180 
ATCCAGCCCT GCCCTTGAGT CCCACAGGTC TTGCAGGAAG CTTGACAAAT GCCCTCAGCA 240 
ATGGCCTGCT GTCTGGGGGC CTGTTGGGCA TTCTGGAAAA CCTTCCGCTC CTGGACATCC 300 
TGAAGCCTGG AGGAGGTACT TCTGGTGGCC TCCTTGGGGG ACTGCTTGGA AAAGTGACGT 360 
CAGTGATTCC TGGCCTGAAC AACATCATTG ACATAAAGGT CACTGACCCC CAGCTGCTGG 420 
AACTTGGCCT TGTGCAGAGC CCTGATGGCC ACCGTCTCTA TGTCACCATC CCTCTCGGCA 480 
TAAAGCTCCA AGTGAATACG CCCCTGGTCG GTGCAAGTCT GTTGAGGCTG GCXGTGAAGC 540 
TGGACATCAC TGCAGAAATC TTAGCTGTGA GAGATAAGCA GGAGAGGATC CACCTGGTCC 600 
TTGGTGACTG CACCCATTCC CCTGGAAGCC TGCAAATTTC TCTGCTTGAT GGACTTGGCC 660 
CCCTCCCCAT TCAAGGTCTT CTGGACAGCC TCACAGGGAT CTTGAATAAA GTCCTGCCTG 720 
AGTTGGTTCA GGGCAACGTG TGCCCTCTGG TCAATGAGGT TCTCAGAGGC TTGGACATCA 780 
CCCTGGTGCA TGACATTGTT AACATGCTGA TCCACGGACT ACAGTTTGTC ATCAAGGTCT 840 
AAGCCTTCCA GGAAGGGGCT GGCCTCTGCT GAGCTGCTTC CCAGTGCTCA CAGATGGCTG 900 
GCCCATGTGC TGGAAGATGA CACAGTTGCC TTCTCTCCGA GGAACCTGCC CCCTCTCCTT 960 
TCCCACCAGG CGTGTGTAAC ATCCCATGTG CCTCACCTAA TAAAATGGCT CTTCTTCTGC 1020 
AAAAAAAAAA AAAAAAAAAA AAAAAAAAA 

Seq ID NO: 531 Protein sequence 
Protein Accession #: NP_057667.1 

1 11 21 31 41 51 

I I I I I I 

MFQTGGLIVF YGLLAQTMAQ FGGLPVPLDQ TLPLNVNPAL PLSPTGLAGS LTNALSNGLL 60 
SGGLLGILEN LPLLDILKPG GGTSGGLLGG LLGKVTSVIP GLNNIIDIKV TDPQLLELGL 120 
VQSPDGHRLY VTIPLGIKLQ VNTPLVGASL LRLAVKLDIT AEILAVRDKQ ERIHLVLGDC 180 
THSPGSLQIS LLDGLGPLPI QGLLDSLTGI LNKVLPELVQ GNVCPLVNEV LRGLDITLVH 240 
DIVNMLIHGL QFVIKV 

Seq ID NO: 532 DNA sequence 

Nucleic Acid Accession #: NM_004363.1 

Coding sequence: 115..2223 

1 11 21 31 41 51 

I I I I I I 

CTCAGGGCAG AGGGAGGAAG GACAGCAGAC CAGACAGTCA CAGCAGCCTT GACAAAACGT 60 

TCCTGGAACT CAAGCTCTTC TCCACAGAGG AGGACAGAGC AGACAGCAGA GACCATGGAG 120 

TCTCCCTCGG CCCCTCCCCA CAGATGGTGC ATCCCCTGGC AGAGGCTCCT GCTCACAGCC 180 

TCACTTCTAA CCTTCTGGAA CCCGCCCACC ACTGCCAAGC TCACTATTGA ATCCACGCCG 240 

TTCAATGTCG CAGAGGGGAA GGAGGTGCTT CTACTTGTCC ACAATCTGCC CCAGCATCTT 300 

TTTGGCTACA GCTGGTACAA AGGTGAAAGA GTGGATGGCA ACCGTCAAAT TATAGGATAT 360 

GTAATAGGAA CTCAACAAGC TACCCCAGGG CCCGCATACA GTGGTCGAGA GATAATATAC 420 

CCCAATGCAT CCCTGCTGAT CCAGAACATC ATCCAGAATG ACACAGGATT CTACACCCTA 480 

CACGTCATAA AGTCAGATCT TGTGAATGAA GAAGCAACTG GCCAGTTCCG GGTATACCCG 540 

GAGCTGCCCA AGCCCTCCAT CTCCAGCAAC AACTCCAAAC CCGTGGAGGA CAAGGATGCT 600 

GTGGCCTTCA CCTGTGAACC TGAGACTCAG GACGCAACCT ACCTGTGGTG GGTAAACAAT 660 

CAGAGCCTCC CGGTCAGTCC CAGGCTGCAG CTGTCCAATG GCAACAGGAC CCTCACTCTA 720 

TTCAATGTCA CAAGAAATGA CACAGCAAGC TACAAATGTG AAACCCAGAA CCCAGTGAGT 780 

GCCAGGCGCA GTGATTCAGT CATCCTGAAT GTCCTCTATG GCCCGGATGC CCCCACCATT 840 

TCCCCTCTAA ACACATCTTA CAGATCAGGG GAAAATCTGA ACCTCTCCTG CCACGCAGCC 900 

TCTAACCCAC CTGCACAGTA CTCTTGGTTT GTCAATGGGA CTTTCCAGCA ATCCACCCAA 960 

GAGCTCTTTA TCCCCAACAT CACTGTGAAT AATAGTGGAT CCTATACGTG CCAAGCCCAT 1020 

AACTCAGACA CTGGCCTCAA TAGGACCACA GTCACGACGA TCACAGTCTA TGCAGAGCCA 1080 

CCCAAACCCT TCATCACCAG CAACAACTCC AACCCCGTGG AGGATGAGGA TGCTGTAGCC 1140 

TTAACCTGTG AACCTGAGAT TCAGAACACA ACCTACCTGT GGTGGGTAAA TAATCAGAGC 1200 

CTCCCGGTCA GTCCCAGGCT GCAGCTGTCC AATGACAACA GGACCCTCAC TCTACTCAGT 1260 

GTCACAAGGA ATGATGTAGG ACCCTATGAG TGTGGAATCC AGAACGAATT AAGTGTTGAC 1320 

CACAGCGACC CAGTCATCCT GAATGTCCTC TATGGCCCAG ACGACCCCAC CATTTCCCCC 1380 

TCATACACCT ATTACCGTCC AGGGGTGAAC CTCAGCCTCT CCTGCCATGC AGCCTCTAAC 1440 

CCACCTGCAC AGTATTCTTG GCTGATTGAT GGGAACATCC AGCAACACAC ACAAGAGCTC 1500 

TTTATCTCCA ACATCACTGA GAAGAACAGC GGACTCTATA CCTGCCAGGC CAATAACTCA 1560 

GCCAGTGGCC ACAGCAGGAC TACAGTCAAG ACAATCACAG TCTCTGCGGA GCTGCCCAAG 1620 

CCCTCCATCT CCAGCAACAA CTCCAAACCC GTGGAGGACA AGGATGCTGT GGCCTTCACC 1680 

TGTGAACCTG AGGCTCAGAA CACAACCTAC CTGTGGTGGG TAAATGGTCA GAGCCTCCCA 1740 

GTCAGTCCCA GGCTGCAGCT GTCCAATGGC AACAGGACCC TCACTCTATT CAATGTCACA 1800 

AGAAATGACG CAAGAGCCTA TGTATGTGGA ATCCAGAACT CAGTGAGTGC AAACCGCAGT 1860 

GACCCAGTCA CCCTGGATGT CCTCTATGGG CCGGACACCC CCATCATTTC CCCCCCAGAC 1920 

TCGTCTTACC TTTCGGGAGC GAACCTCAAC CTCTCCTGCC ACTCGGCCTC TAACCCATCC 1980 

CCGCAGTATT CTTGGCGTAT CAATGGGATA CCGCAGCAAC ACACACAAGT TCTCTTTATC 2040 

GCCAAAATCA CGCCAAATAA TAACGGGACC TATGCCTGTT TTGTCTCTAA CTTGGCTACT 2100 

GGCCGCAATA ATTCCATAGT CAAGAGCATC ACAGTCTCTG CATCTGGAAC TTCTCCTGGT 2160 

CTCTCAGCTG GGGCCACTGT CGGCATCATG ATTGGAGTGC TGGTTGGGGT TGCTCTGATA 2220 

TAGCAGCCCT GGTGTAGTTT CTTCATTTCA GGAAGACTGA CAGTTGTTTT GCTTCTTCCT 2280 

TAAAGCATTT GCAACAGCTA CAGTCTAAAA TTGCTTCTTT ACCAAGGATA TTTACAGAAA 2340 

AGACTCTGAC CAGAGATCGA GACCATCCTA GCCAACATCG TGAAACCCCA TCTCTACTAA 2400 

AAATACAAAA ATGAGCTGGG CTTGGTGGCG CGCACCTGTA GTCCCAGTTA CTCGGGAGGC 2460 

TGAGGCAGGA GAATCGCTTG AACCCGGGAG GTGGAGATTG CAGTGAGCCC AGATCGCACC 2520 

ACTGCACTCC AGTCTGGCAA CAGAGCAAGA CTCCATCTCA AAAAGAAAAG AAAAGAAGAC 2580 

TCTGACCTGT ACTCTTGAAT ACAAGTTTCT GATACCACTG CACTGTCTGA GAATTTCCAA 2640 

AACTTTAATG AACTAACTGA CAGCTTCATG AAACTGTCCA CCAAGATCAA GCAGAGAAAA 2700 

TAATTAATTT CATGGGACTA AATGAACTAA TGAGGATTGC TGATTCTTTA AATGTCTTGT 2760 

TTCCCAGATT TCAGGAAACT TTTTTTCTTT TAAGCTATCC ACTCTTACAG CAATTTGATA 2820 

AAATATACTT TTGTGAACAA AAATTGAGAC ATTTACATTT TCTCCCTATG TGGTCGCTCC 2880 

AGACTTGGGA AACTATTCAT GAATATTTAT ATTGTATGGT AATATAGTTA TTGCACAAGT 2940 
TCAATAAAAA TCTGCTCTTT GTATAACAGA AAAA 

Seq ID NO: 533 Protein sequence 
Protein Accession #: NP_004354.1 

1 11 21 31 41 51 

I I I I I I 

MESPSAPPHR WCIPWQRLLL TASLLTFWNP PTTAKLTIES TPFNVAEGKE VLLLVHNLPQ 60 

HLFGYSWYKG ERVDGNRQII GYVIGTQQAT PGPAYSGREI IYPNASLLIQ NIIQNDTGFY 120 

TLHVIKSDLV NEEATGQFRV YPELPKPSIS SNNSKPVEDK DAVAFTCEPE TQDATYLWWV 180 

NNQSLPVSPR LQLSNGNRTL TLFNVTRNDT ASYKCETQNP VSARRSDSVI LNVLYGPDAP 240 

TISPLNTSYR SGENLNLSCH AASNPPAQYS WFVNGTFQQS TQELFIPNIT VNNSGSYTCQ 300 

AHNSDTGLNR TTVTTITVYA EPPKPFITSN NSNPVEDEDA VALTCEPEIQ NTTYLWWVNN 360 

QSLPVSPRLQ LSNDNRTLTL LSVTRNDVGP YECGIQNELS VDHSDPVILN VLYGPDDPTI 420 

SPSYTYYRPG VNLSLSCHAA SNPPAQYSWL IDGNIQQHTQ ELFISNITEK NSGLYTCQAN 480 

NSASGHSRTT VKTITVSAEL PKPSISSNNS KPVEDKDAVA FTCEPEAQNT TYLWWVNGQS 540 

LPVSPRLQLS NGNRTLTLFN VTRNDARAYV CGIQNSVSAN RSDPVTLDVL YGPDTPIISP 600 

PDSSYLSGAN LNLSCHSASN PSPQYSWRIN GIPQQHTQVL FIAKITPNNN GTYACFVSNL 660 
ATGRNNSIVK SITVSASGTS PGLSAGATVG IMIGVLVGVA LI 

Seq ID NO: 534 DNA sequence 

Nucleic Acid Accession #: NM_006952.1 

Coding sequence: 11..793 

1 11 21 31 41 51 

I I I I I I 

AATCCCGACA ATGGCGAAAG ACAACTCAAC TGTTCGTTGC TTCCAGGGCC TGCTGATTTT 60 

TGGAAATGTG ATTATTGGTT GTTGCGGCAT TGCCCTGACT GCGGAGTGCA TCTTCTTTGT 120 

ATCTGACCAA CACAGCCTCT ACCCACTGCT TGAAGCCACC GACAACGATG ACATCTATGG 180 

GGCTGCCTGG ATCGGCATAT TTGTGGGCAT CTGCCTCTTC TGCCTGTCTG TTCTAGGCAT 240 

TGTAGGCATC ATGAAGTCCA GCAGGAAAAT TCTTCTGGCG TATTTCATTC TGATGTTTAT 300 

AGTATATGCC TTTGAAGTGG CATCTTGTAT CACAGCAGCA ACACAACGAG ACTTTTTCAC 360 

ACCCAACCTC TTCCTGAAGC AGATGCTAGA GAGGTACCAA AACAACAGCC CTCCAAACAA 420 

TGATGACCAG TGGAAAAACA ATGGAGTCAC CAAAACCTGG GACAGGCTCA TGCTCCAGGA 480 

CAATTGCTGT GGCGTAAATG GTCCATCAGA CTGGCAAAAA TACACATCTG CCTTCCGGAC 540 

TGAGAATAAT GATGCTGACT ATCCCTGGCC TCGTCAATGC TGTGTTATGA ACAATCTTAA 600 

AGAACCTCTC AACCTGGAGG CTTGTAAACT AGGCGTGCCT GGTTTTTATC ACAATCAGGG 660 

CTGCTATGAA CTGATCTCTG GTCCAATGAA CCGACACGCC TGGGGGGTTG CCTGGTTTGG 720 

ATTTGCCATT CTCTGCTGGA CTTTTTGGGT TCTCCTGGGT ACCATGTTCT ACTGGAGCAG 780 
AATTGAATAT TAAGAA 

Seq ID NO: 535 Protein sequence 
Protein Accession #: NP_008883.1 

1 11 21 31 41 51 

I I I I I I 

MAKDNSTVRC FQGLLIFGNV IIGCCGIALT AECIFFVSDQ HSLYPLLEAT DNDDIYGAAW 60 

IGIFVGICLF CLSVLGIVGI MKSSRKILLA YFILMFIVYA FEVASCITAA TQRDFFTPNL 120 

FLKQMLERYQ NNSPPNNDDQ WKNNGVTKTW DRLMLQDNCC GVNGPSDWQK YTSAFRTENN 180 

DADYPWPRQC CVMNNLKEPL NLEACKLGVP GFYHNQGCYE LISGPMNRHA WGVAWFGFAI 240 
LCWTFWVLLG TMFYWSRIEY 

Seq ID NO: 536 DNA sequence 

Nucleic Acid Accession #: NM_002638.1 

Coding sequence: 120..473 

1 11 21 31 41 51 

I I I I I I 

CAATACAGCT AAGGAATTAT CCCTTGTAAA TACCACAGAC CCGCCCTGGA GCCAGGCCAA 60 

GCTGGACTGC ATAAAGATTG GTATGGCCTT AGCTCTTAGC CAAACACCTT CCTGACACCA 120 

TGAGGGCCAG CAGCTTCTTG ATCGTGGTGG TGTTCCTCAT CGCTGGGACG CTGGTTCTAG 180 

AGGCAGCTGT CACGGGAGTT CCTGTTAAAG GTCAAGACAC TGTCAAAGGC CGTGTTCCAT 240 

TCAATGGACA AGATCCCGTT AAAGGACAAG TTTCAGTTAA AGGTCAAGAT AAAGTCAAAG 300 

CGCAAGAGCC AGTCAAAGGT CCAGTCTCCA CTAAGCCTGG CTCCTGCCCC ATTATCTTGA 360 

TCCGGTGCGC CATGTTGAAT CCCCCTAACC GCTGCTTGAA AGATACTGAC TGCCCAGGAA 420 

TCAAGAAGTG CTGTGAAGGC TCTTGCGGGA TGGCCTGTTT GGTTCCCCAG TGAAGGGAGC 480 

CGGTCCTTGC TGCACCTGTG CCX3TCCCCAG AGCTACAGGC CCCATCTGGT CCTAAGTCCC 540 

TGCTGCCCTT CCCCTTCCCA CACTGTCCAT TCTTCCTCCC ATTCAGGATG CCCACGGCTG 600 
GAGCTGCCTC TCTCATCCAC TTTCCAATAA A 

Seq ID NO: 537 Protein sequence 
Protein Accession #: NP_002629.1 

1 11 21 31 41 51 

I I I I I I 

MRASSFLIVV VFLIAGTLVL EAAVTGVPVK GQDTVKGRVP FNGQDPVKGQ VSVKGQDKVK 60 
AQEPVKGPVS TKPGSCPIIL IRCAMLNPPN RCLKDTDCPG IKKCCEGSCG MACFVPQ 

Seq ID NO: 538 DNA sequence 

Nucleic Acid Accession #: NM_001793.2 

Coding sequence: 71..2560 

1 11 21 31 41 51 

I I I I I I 

AAAGGGGCAA GAGCTGAGCG GAACACCGGC CCGCCGTCGC GGCAGCTGCT TCACCCCTCT 60 

CTCTGCAGCC ATGGGGCTCC CTCGTGGACC TCTCGCGTCT CTCCTCCTTC TCCAGGTTTG 120 

CTGGCTGCAG TGCGCGGCCT CCGAGCCGTG CCGGGCGGTC TTCAGGGAGG CTGAAGTGAC 180 

CTTGGAGGCG GGAGGCGCGG AGCAGGAGCC CGGCCAGGCG CTGGGGAAAG TATTCATGGG 240 

CTGCCCTGGG CAAGAGCCAG CTCTGTTTAG CACTGATAAT GATGACTTCA CTGTGCGGAA 300 

TGGCGAGACA GTCCAGGAAA GAAGGTCACT GAAGGAAAGG AATCCATTGA AGATCTTCCC 360 

ATCCAAACGT ATCTTACGAA GACACAAGAG AGATTGGGTG GTTGCTCCAA TATCTGTCCC 420 

TGAAAATGGC AAGGGTCCCT TCCCCCAGAG ACTGAATCAG CTCAAGTCTA ATAAAGATAG 480 

AGACACCAAG ATTTTCTACA GCATCACGGG GCCGGGGGCA GACAGCCCCC CTGAGGGTGT 540 

CTTCGCTGTA GAGAAGGAGA CAGGCTGGTT GTTGTTGAAT AAGCCACTGG ACCGGGAGGA 600 

GATTGCCAAG TATGAGCTCT TTGGCCACGC TGTGTCAGAG AATGGTGCCT CAGTGGAGGA 660 

CCCCATGAAC ATCTCCATCA TCGTGACCGA CCAGAATGAC CACAAGCCCA AGTTTACCCA 720 

GGACACCTTC CGAGGGAGTG TCTTAGAGGG AGTCCTACCA GGTACTTCTG TGATGCAGGT 780 

GACAGCCACG GATGAGGATG ATGCCATCTA CACCTACAAT GGGGTGGTTG CTTACTCCAT 840 

CCATAGCCAA GAACCAAAGG ACCCACACGA CCTCATGTTC ACCATTCACC GGAGCACAGG 900 

CACCATCAGC GTCATCTCCA GTGGCCTGGA CCGGGAAAAA GTCCCTGAGT ACACACTGAC 960 

CATCCAGGCC ACAGACATGG ATGGGGACGG CTCCACCACC ACGGCAGTGG CAGTAGTGGA 1020 

GATCCTTGAT GCCAATGACA ATGCTCCCAT GTTTGACCCC CAGAAGTACG AGGCCCATGT 1080 

GCCTGAGAAT GCAGTGGGCC ATGAGGTGCA GAGGCTGACG GTCACTGATC TGGACGCCCC 1140 

CAACTCACCA GCGTGGCGTG CCACCTACCT TATCATGGGC GGTGACGACG GGGACCATTT 1200 

TACCATCACC ACCCACCCTG AGAGCAACCA GGGCATCCTG ACAACCAGGA AGGGTTTGGA 1260 

TTTTGAGGCC AAAAACCAGC ACACCCTGTA CGTTGAAGTG ACCAACGAGG CCCCTTTTGT 1320 

GCTGAAGCTC CCAACCTCCA CAGCCACCAT AGTGGTCCAC GTGGAGGATG TGAATGAGGC 1380 

ACCTGTGTTT GTCCCACCCT CCAAAGTCGT TGAGGTCCAG GAGGGCATCC CCACTGGGGA 1440 

GCCTGTGTGT GTCTACACTG CAGAAGACCC TGACAAGGAG AATCAAAAGA TCAGCTACCG 1500 

CATCCTGAGA GACCCAGCAG GGTGGCTAGC CATGGACCCA GACAGTGGGC AGGTCACAGC 1560 

TGTGGGCACC CTCGACCGTG AGGATGAGCA GTTTGTGAGG AACAACATCT ATGAAGTCAT 1620 

GGTCTTGGCC ATGGACAATG GAAGCCCTCC CACCACTGGC ACGGGAACCC TTCTGCTAAC 1680 

ACTGATTGAT GTCAATGACC ATGGCCCAGT CCCTGAGCCC CGTCAGATCA CCATCTGCAA 1740 

CCAAAGCCCT GTGCGCCAGG TGCTGAACAT CACGGACAAG GACCTGTCTC CCCACACCTC 1800 

CCCTTTCCAG GCCCAGCTCA CAGATGACTC AGACATCTAC TGGACGGCAG AGGTCAACGA 1860 

GGAAGGTGAC ACAGTGGTCT TGTCCCTGAA GAAGTTCCTG AAGCAGGATA CATATGACGT 1920 

GCACCTTTCT CTGTCTGACC ATGGCAACAA AGAGCAGCTG ACGGTGATCA GGGCCACTGT 1980 

GTGCGACTGC CATGGCCATG TCGAAACCTG CCCTGGACCC TGGAAGGGAG GTTTCATCCT 2040 

CCCTGTGCTG GGGGCTGTCC TGGCTCTGCT GTTCCTCCTG CTGGTGCTGC TTTTGTTGGT 2100 

GAGAAAGAAG CGGAAGATCA AGGAGCCCCT CCTACTCCCA GAAGATGACA CCCGTGACAA 2160 

CGTCTTCTAC TATGGCGAAG AGGGGGGTGG CGAAGAGGAC CAGGACTATG ACATCACCCA 2220 

GCTCCACCGA GGTCTGGAGG CCAGGCCGGA GGTGGTTCTC CGCAATGACG TGGCACCAAC 2280 

CATCATCCCG ACACCCATGT ACCGTCCTCG GCCAGCCAAC CCAGATGAAA TCGGCAACTT 2340 

TATAATTGAG AACCTGAAGG CGGCTAACAC AGACCCCACA GCCCCGCCCT ACGACACCCT 2400 

CTTGGTGTTC GACTATGAGG GCAGCGGCTC CGACGCCGCG TCCCTGAGCT CCCTCACCTC 2460 

CTCCGCCTCC GACCAAGACC AAGATTACGA TTATCTGAAC GAGTGGGGCA GCCGCTTCAA 2520 

GAAGCTGGCA GACATGTACG GTGGCGGGGA GGACGACTAG GCGGCCTGCC TGCAGGGCTG 2580 

GGGACCAAAC GTCAGGCCAC AGAGCATCTC CAAGGGGTCT CAGTTCCCCC TTCAGCTGAG 2640 

GACTTCGGAG CTTGTCAGGA AGTGGCCGTA GCAACTTGGC GGAGACAGGC TATGAGTCTG 2700 

ACGTTAGAGT GGTTGCTTCC TTAGCCTTTC AGGATGGAGG AATGTGGGCA GTTTGACTTC 2760 

AGCACTGAAA ACCTCTCCAC CTGGGCCAGG GTTGCCTCAG AGGCCAAGTT TCCAGAAGCC 2820 

TCTTACCTGC CGTAAAATGC TCAACCCTGT GTCCTGGGCC TGGGCCTGCT GTGACTGACC 2880 

TACAGTGGAC TTTCTCTCTG GAATGGAACC TTCTTAGGCC TCCTGGTGCA ACTTAATTTT 2940 



TTTTTTTAAT GCTATCTTCA AAACGTTAGA GAAAGTTCTT CAAAAGTGCA GCCCAGAGCT 3000 
GCTGGGCCCA CTGGCCGTCC TGCATTTCTG GTTTCCAGAC CCCAATGCCT CCCATTCGGA 3060 
TGGATCTCTG OGTTTTTATA CTGAGTGTGC CTAGGTTGCC CCTTATTTTT TATTTTCCCT 3120 
GTTGCGTTGC TATAGATGAA GGGTGAGGAC AATCGTGTAT ATGTACTAGA ACTTTTTTAT 3180 
TAAAGAAACT TTTCCCAGAA AAAAA 

Seq ID NO: 539 Protein sequence 
Protein Accession #: NP_001784.2 

1 11 21 31 41 51 

I I I I I I 

MGLPRGPLAS LLLLQVCWLQ CAASEPCRAV FREAEVTLEA GGAEQEPGQA LGKVFMGCPG 60 
QEPALFSTDN DDFTVRNGET VQERRSLKER NPLKIFPSKR ILRRHKRDWV VAPISVPENG 120 
KGPFPQRLNQ LKSNKDRDTK IFYSITGPGA DSPPEGVFAV EKETGWLLLN KPLDREEIAK 180 
YELFGHAVSE NGASVEDPMN ISIIVTDQND HKPKFTQDTF RGSVLEGVLP GTSVMQVTAT 240 
DEDDAIYTYN GVVAYSIHSQ EPKDPHDLMF TIHRSTGTIS VISSGLDREK VPEYTLTIQA 300 
TDMDGDGSTT TAVAVVEILD ANDNAPMFDP QKYEAHVPEN AVGHEVQRLT VTDLDAPNSP 360 
AWRATYLIMG GDDGDHFTIT THPESNQGIL TTRKGLDFEA KNQHTLYVEV TNEAPFVLKL 420 
PTSTATIVVH VEDVNEAPVF VPPSKVVEVQ EGIPTGEPVC VYTAEDPDKE NQKISYRILR 480 
DPAGWLAMDP DSGQVTAVGT LDREDEQFVR NNIYEVMVLA MDNGSPPTTG TGTLLLTLID 540 
VNDHGPVPEP RQITICNQSP VRQVLNITDK DLSPHTSPFQ AQLTDDSDIY WTAEVNEEGD 600 
TVVLSLKKFL KQDTYDVHLS LSDHGNKEQL TVIRATVCDC HGHVETCPGP WKGGFILPVL 660 
GAVLALLFLL LVLLLLVRKK RKIKEPLLLP EDDTRDNVFY YGEEGGGEED QDYDITQLHR 720 
GLEARPEVVL RNDVAPTIIP TPMYRPRPAN PDEIGNFIIE NLKAANTDPT APPYDTLLVF 780 
DYEGSGSDAA SLSSLTSSAS DQDQDYDYLN EWGSRFKKLA DMYGGGEDD 

Seq ID NO: 540 DNA sequence 
Nucleic Acid Accession #: Eos sequence 
Coding sequence: 1..672 

1 11 21 31 41 51 

I I I I I I 

ATGAGGCTCC AAAGACCCCG ACAGGCCCCG GCGGGTGGGA GGCGCGCGCC CCGGGGCGGG 60 
CGGGGCTCCC CCTACCGGCC AGACCCGGGG AGAGGCGCGC GGAGGCTGCG AAGGTTCCAG 120 
AAGGGCGGGG AGGGGGCGCC GCGCGCTGAC CCTCCCTGGG CACCGCTGGG GACGATGGCG 180 
CTGCTCGCCT TGCTGCTGGT CGTGGCCCTA CCGCGGGTGT GGACAGACGC CAACCTGACT 240 
GCGAGACAAC GAGATCCAGA GGACTCCCAG CGAACGGACG AGGGTGACAA TAGAGTGTGG 300 
TGTCATGTTT GTGAGAGAGA AAACACTTTC GAGTGCCAGA ACCCAAGGAG GTGCAAATGG 360 
ACAGAGCCAT ACTGCGTTAT AGCGGCCGTG AAAATATTTC CACGTTTTTT CATGGTTGCG 420 
AAGCAGTGCT CCGCTGGTTG TGCAGCGATG GAGAGACCCA AGCCAGAGGA GAAGCGGTTT 480 
CTCCTGGAAG AGCCCATGCC CTTCTTTTAC CTCAAGTGTT GTAAAATTCG CTACTGCAAT 540 
TTAGAGGGGC CACCTATCAA CTCATCAGTG TTCAAAGAAT ATGCTGGGAG CATGGGTGAG 600 
AGCTGTGGTG GGCTGTGGCT GGCCATCCTC CTGCTGCTGG CCTCCATTGC AGCCGGCCTC 660 
AGCCTGTCTT GA 

Seq ID NO: 541 Protein sequence 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

I I I I I I 

MRLQRPRQAP AGGRRAPRGG RGSPYRPDPG RGARRLRRFQ KGGEGAPRAD PPWAPLGTMA 60 
LLALLLVVAL PRVWTDANLT ARQRDPEDSQ RTDEGDNRVW CHVCERENTF ECQNPRRCKW 120 
TEPYCVIAAV KIFPRFFMVA KQCSAGCAAM ERPKPEEKRF LLEEPMPFFY LKCCKIRYCN 180 
LEGPPINSSV FKEYAGSMGE SCGGLWLAIL LLLASIAAGL SLS 

Seq ID NO: 542 DNA sequence 

Nucleic Acid Accession #: XM_035292.2 

Coding sequence: 53..1576 

1 11 21 31 41 51 

I I I I I I 

GCTCGCTGGG CCGCGGCTCC CGGGTGTCCC AGGCCCGGCC GGTGCGCAGA GCATGGCGGG 60 
TGCGGGCCCG AAGCGGCGCG CGCTAGCGGC GCCGGCGGCC GAGGAGAAGG AAGAGGCGCG 120 
GGAGAAGATG CTGGCCGCCA AGAGCGCGGA CGGCTCGGCG CCGGCAGGCG AGGGCGAGGG 180 
CGTGACCCTG CAGCGGAACA TCACGCTGCT CAACGGCGTG GCCATCATCG TGGGGACCAT 240 
TATCGGCTCG GGCATCTTCG TGACGCCCAC GGGCGTGCTC AAGGAGGCAG GCTCGCCGGG 300 
GCTGGCGCTG GTGGTGTGGG CCGCGTGCGG CGTCTTCTCC ATCGTGGGCG CGCTCTGCTA 360 
CGCGGAGCTC GGCACCACCA TCTCCAAATC GGGCGGCGAC TACGCCTACA TGCTGGAGGT 420 
CTACGGCTCG CTGCCCGCCT TCCTCAAGCT CTGGATCGAG CTGCTCATCA TCCGGCCTTC 480 
ATCGCAGTAC ATCGTGGCCC TGGTCTTCGC CACCTACCTG CTCAAGCCGC TCTTCCCCAC 540 
CTGCCCGGTG CCCGAGGAGG CAGCCAAGCT CGTGGCCTGC CTCTGCGTGC TGCTGCTCAC 600 
GGCCGTGAAC TGCTACAGCG TGAAGGCCGC CACCCGGGTC CAGGATGCCT TTGCCGCCGC 660 
CAAGCTCCTG GCCCTGGCCC TGATCATCCT GCTGGGCTTC GTCCAGATCG GGAAGGGTGA 720 
TGTGTCCAAT CTAGATCCCA ACTTCTCATT TGAAGGCACC AAACTGGATG TGGGGAACAT 780 
TGTGCTGGCA TTATACAGCG GCCTCTTTGC CTATGGAGGA TGGAATTACT TGAATTTCGT 840 
CACAGAGGAA ATGATCAACC CCTACAGAAA CCTGCCCCTG GCCATCATCA TCTCCCTGCC 900 
CATCGTGACG CTGGTGTACG TGCTGACCAA CCTGGCCTAC TTCACCACCC TGTCCACCGA 960 
GCAGATGCTG TCGTCCGAGG CCGTGGCCGT GGACTTCGGG AACTATCACC TGGGCGTCAT 1020 
GTCCTGGATC ATCCCCGTCT TCGTGGGCCT GTCCTGCTTC GGCTCCGTCA ATGGGTCCCT 1080 
GTTCACATCC TCCAGGCTCT TCTTCGTGGG GTCCCGGGAA GGCCACCTGC CCTCCATCCT 1140 
CTCCATGATC CACCCACAGC TCCTCACCCC CGTGCCGTCC CTCGTGTTCA CGTGTGTGAT 1200 
GACGCTGCTC TACGCCTTCT CCAAGGACAT CTTCTCCGTC ATCAACTTCT TCAGCTTCTT 1260 
CAACTGGCTC TGCGTGGCCC TGGCCATCAT CGGCATGATC TGGCTGCGCC ACAGAAAGCC 1320 
TGAGCTTGAG CGGCCCATCA AGGTGAACCT GGCCCTGCCT GTGTTCTTCA TCCTGGCCTG 1380 
CCTCTTCCTG ATCGCCGTCT CCTTCTGGAA GACACCCGTG GAGTGTGGCA TCGGCTTCAC 1440 
CATCATCCTC AGCGGGCTGC CCGTCTACTT CTTCGGGGTC TGGTGGAAAA ACAAGCCCAA 1500 
GTGGCTCCTC CAGGGCATCT TCTCCACGAC CGTCCTGTGT CAGAAGCTCA TGCAGGTGGT 1560 
CCCCCAGGAG ACATAGCCAG GAGGCCGAGT GGCTGCCGGA GGAGCATGC 



Seq ID NO: 543 Protein sequence 
Protein Accession #: XP_035292.2 

1 11 21 31 41 51 

I I I I I I 

MAGAGPKRRA LAAPAAEEKE EAREKMLAAK SADGSAPAGE GEGVTLQRNI TLLNGVAIIV 60 

GTIIGSGIFV TPTGVLKEAG SPGLALVVWA ACGVFSIVGA LCYAELGTTI SKSGGDYAYM 120 

LEVYGSLPAF LKLWIELLII RPSSQYIVAL VFATYLLKPL FPTCPVPEEA AKLVACLCVL 180 

LLTAVNCYSV KAATRVQDAF AAAKLLALAL IILLGFVQIG KGDVSNLDPN FSFEGTKLDV 240 

GNIVLALYSG LFAYGGWNYL NFVTEEMINP YRNLPLAIII SLPIVTLVYV LTNLAYFTTL 300 

STEQMLSSEA VAVDFGNYHL GVMSWIIPVF VGLSCFGSVN GSLFTSSRLF FVGSREGHLP 360 

SILSMIHPQL LTPVPSLVFT CVMTLLYAFS KDIFSVINFF SFFNWLCVAL AIIGMIWLRH 420 

RKPELERPIK VNLALPVFFI LACLFLIAVS FWKTPVECGI GFTIILSGLP VYFFGVWWKN 480 
KPKWLLQGIF STTVLCQKLM QVVPQET 

Seq ID NO: 544 DNA sequence 
Nucleic Acid Accession 8: NM_005268.1 
20 Coding sequence: 168.. 989 

1 11 21 31 41 51 

I I I I I I ' M 

TAAAAAGCAA AAGAATTCGC GGCCGCGTCG ACACGGGCTT CCCCGAAAAC CTTCCCCGCT 60 

TCTGGATATG AAATTCAAGC TGCTTGCTGA GTCCTATTGC CGGCTGCTGG GAGCCAGGAG 120 

AGCCCTGAGG AGTAGTCACT CAGTAGCAGC TGACGCGTGG GTCCACCATG AACTGGAGTA 180 

TCTTTGAGGG ACTCCTGAGT GGGGTCAACA AGTACTCCAC AGCCTTTGGG CGCATCTGGC 240 

TGTCTCTGGT CTTCATCTTC CGCGTGCTGG TGTACCTGGT GACGGCCGAG CGTGTGTGGA 300 

GTGATGACCA CAAGGACTTC GACTGCAATA CTCGCCAGCC CGGCTGCTCC AACGTCTGCT 360 

TTGATGAGTT CTTCCCTGTG TCCCATGTGC GCCTCTGGGC CCTGCAGCTT ATCCTGGTGA 420 

CATGCCCCTC ACTGCTCGTG GTCATGCACG TGGCCTACCG GGAGGTTCAG GAGAAGAGGC 480 

ACCGAGAAGC CCATGGGGAG AACAGTGGGC GCCTCTACCT GAACCCCGGC AAGAAGCGGG 540 

GTGGGCTCTG GTGGACATAT GTCTGCAGCC TAGTGTTCAA GGCGAGCGTG GACATCGCCT 600 

TTCTCTATGT GTTCCACTCA TTCTACCCCA AATATATCCT CCCTCCTGTG GTCAAGTGCC 660 

ACGCAGATCC ATGTCCCAAT ATAGTGGACT GCTTCATCTC CAAGCCCTCA GAGAAGAACA 720 

TTTTCACCCT CTTCATGGTG GCCACAGCTG CCATCTGCAT CCTGCTCAAC CTCGTGGAGC 780 

TCATCTACCT GGTGAGCAAG AGATGCCACG AGTGCCTGGC AGCAAGGAAA GCTCAAGCCA 840 

TGTGCACAGG TCATCACCCC CACGGTACCA CCTCTTCCTG CAAACAAGAC GACCTCCTTT 900 

CGGGTGACCT CATCTTTCTG GGCTCAGACA GTCATCCTCC TCTCTTACCA GACCGCCCCC 960 

GAGACCATGT GAAGAAAACC ATCTTGTGAG GGGCTGCCTG GACTGGTCTG GCAGGTTGGG 1020 

CCTGGATGGG GAGGCTCTAG CATCTCTCAT AGGTGCAACC TGAGAGTGGG GGAGCTAAGC 1080 

CATGAGGTAG GGGCAGGCAA GAGAGAGGAT TCAGACGCTC TGGGAGCCAG TTCCTAGTCC 1140 
TCAACTCCAG CCACCTGCCC CAGCTCGACG GCACTGGGCC AGTTCCCCCT CTGCTCTGCA 1200 
GCTCGGTTTC CTTTTCTAGA ATGGAAATAG TGAGGGCCAA TGC 



Seq ID NO: 545 Protein sequence 
Protein Accession #: NP_005259.1 



1 11 21 31 41 51 

I I I I I I 

MNWSIFEGLL SGVNKYSTAF GRIWLSLVFI FRVLVYLVTA ERVWSDDHKD FDCNTRQPGC 60 

SNVCFDEFFP VSHVRLWALQ LILVTCPSLL VVMHVAYREV QEKRHREAHG ENSGRLYLNP 120 

GKKRGGLWWT YVCSLVFKAS VDIAFLYVFH SFYPKYILPP VVKCHADPCP NIVDCFISKP 180 

SEKNIFTLFM VATAAICILL NLVELIYLVS KRCHECLAAR KAQAMCTGHH PHGTTSSCKQ 240 
DDLLSGDLIF LGSDSHPPLL PDRPRDHVKK TIL 



Seq ID NO: 546 DNA sequence 

Nucleic Acid Accession #: NM_002391. 

Coding sequence: 26..457 



1 11 21 31 41 51 

I I I I I I 

CGGGCGAAGC AGCGCGGGCA GCGAGATGCA GCACCGAGGC TTCCTCCTCC TCACCCTCCT 60 

CGCCCTGCTG GCGCTCACCT CCGCGGTCGC CAAAAAGAAA GATAAGGTGA AGAAGGGCGG 120 

CCCGGGGAGC GAGTGCGCTG AGTGGGCCTG GGGGCCCTGC ACCCCCAGCA GCAAGGATTG 180 

CGGCGTGGGT TTCCGCGAGG GCACCTGCGG GGCCCAGACC CAGCGCATCC GGTGCAGGGT 240 

GCCCTGCAAC TGGAAGAAGG AGTTTGGAGC CGACTGCAAG TACAAGTTTG AGAACTGGGG 300 

TGCGTGTGAT GGGGGCACAG GCACCAAAGT CCGCCAAGGC ACCCTGAAGA AGGCGCGCTA 360 

CAATGCTCAG TGCCAGGAGA CCATCCGCGT CACCAAGCCC TGCACCCCCA AGACCAAAGC 420 

AAAGGCCAAA GCCAAGAAAG GGAAGGGAAA GGACTAGACG CCAAGCCTGG ATGCCAAGGA 480 

GCCCCTGGTG TCACATGGGG CCTGGCCACG CCCTCCCTCT CCCAGGCCCG AGATGTGACC 540 

CACCAGTGCC TTCTGTCTGC TCGTTAGCTT TAATCAATCA TGCCCTGCCT TGTCCCTCTC 600 

ACTCCCCAGC CCCACCCCTA AGTGCCCAAA GTGGGQAGGQ ACAAGGGATT CTGGGAAGCT 660 

TGAGCCTCCC CCAAAGCAAT GTGAGTCCCA GAGCCCGCTT TTGTTCTTCC CCACAATTCC 720 

ATTACTAAGA AACACATCAA ATAAACTGAC TTTTTCCCCC CAATAAAAGC TCTTCTTTTT 780 
TAATAT 



Seq ID NO: 547 Protein sequence 
Protein Accession #: NP_002382.1 



1 11 21 31 41 51 

I I I I I I 

MQHRGFLLLT LLALLALTSA VAKKKDKVKK GGPGSECAEW AWGPCTPSSK DCGVGFREGT 60 
CGAQTQRIRC RVPCNWKKEF GADCKYKFEN WGACDGGTGT KVRQGTLKKA RYNAQCQETI 120 
RVTKPCTPKT KAKAKAKKGK GKD 



Seq ID NO: 548 DNA sequence 

Nucleic Acid Accession #: NM_006783.1 

Coding sequence: 1..786 



1 11 21 31 41 51 

I I I I I I 

ATGGATTGGG GGACGCTGCA CACTTTCATC GGGGGTGTCA ACAAACACTC CACCAGCATC 60 
GGGAAGGTGT GGATCACAGT CATCTTTATT TTCCGAGTCA TGATCCTAGT GGTGGCTGCC 120 
CAGGAAGTGT GGGGTGACGA GCAAGAGGAC TTCGTCTGCA ACACACTGCA ACCGGGATGC 180 
AAAAATGTGT GCTATGACCA CTTTTTCCCG GTGTCCCACA TCCGGCTGTG GGCCCTCCAG 240 
CTGATCTTCG TCTCCACCCC AGCGCTGCTG GTGGCCATGC ATGTGGCCTA CTACAGGCAC 300 
GAAACCACTC GCAAGTTCAG CCGAGCAGAG AAGAGGAATG ATTTCAAAGA CATAGAGGAC 360 
ATTAAAAAGC ACAAGGTTCG GATAGAGGGG TCGCTGTGGT GGACGTACAC CAGCAGCATC 420 
TTTTTCCGAA TCATCTTTGA AGCAGCCTTT ATGTATGTGT TTTACTTCCT TTACAATGGG 480 
TACCACCTGC CCTGGGTGTT GAAATGTGGG ATTGACCCCT GCCCCAACCT TGTTGACTGC 540 
TTTATTTCTA GGCCAACAGA GAAGACCGTG TTTACCATTT TTATGATTTC TGCGTCTGTG 600 
ATTTGCATGC TGCTTAACGT GGCAGAGTTG TGCTACCTGC TGCTGAAAGT GTGTTTTAGG 660 
AGATCAAAGA GAGCACAGAC GCAAAAAAAT CACCCCAATC ATGCCCTAAA GGAGAGTAAG 720 
CAGAATGAAA TGAATGAGCT GATTTCAGAT AGTGGTCAAA ATGCAATCAC AGGTTTCCCA 780 
AGCTAA 

Seq ID NO: 549 Protein sequence 
Protein Accession #: NP_006774.1 

1 11 21 31 41 51 

I I I I I I 

MDWGTLHTFI GGVNKHSTSI GKVWITVIFI FRVMILVVAA QEVWGDEQED FVCNTLQPGC 60 
KNVCYDHFFP VSHIRLWALQ LIFVSTPALL VAMHVAYYRH ETTRKFRRGE KRNDFKDIED 120 
IKKHKVRIEG SLWWTYTSSI FFRIIFEAAF MYVFYFLYNG YHLPWVLKCG IDPCPNLVDC 180 
FISRPTEKTV FTIFMISASV ICMLLNVAEL CYLLLKVCFR RSKRAQTQKN HPNHALKESK 240 
QNEMNELISD SGQNAITGFP S 

Seq ID NO: 550 DNA sequence 

Nucleic Acid Accession #: NM_002571.1 

Coding sequence: 99..587 



1 11 21 31 41 51 

I I I I I I 

CATCCCTCTG GCTCCAGAGC TCAGAGCCAC CCACAGCCGC AGCCATGCTG TGCCTCCTGC 60 
TCACCCTGGG CGTGGCCCTG GTCTGTGGTG TCCCGGCCAT GGACATCCCC CAGACCAAGC 120 
AGGACCTGGA GCTCCCAAAG TTGGCAGGGA CCTGGCACTC CATGGCCATG GCGACCAACA 180 
ACATCTCCCT CATGGCGACA CTGAAGGCCC CTCTGAGGGT CCACATCACC TCACTGTTGC 240 
CCACCCCCGA GGACAACCTG GAGATCGTTC TGCACAGATG GGAGAACAAC AGCTGTGTTG 300 
AGAAGAAGGT CCTTGGAGAG AAGACTGGGA ATCCAAAGAA GTTCAAGATC AACTATACGG 360 
TGGCGAACGA GGCCACGCTG CTCGATACTG ACTACGACAA TTTCCTGTTT CTCTGCCTAC 420 
AGGACACCAC CACCCCCATC CAGAGCATGA TGTGCCAGTA CCTGGCCAGA GTCCTGGTGG 480 
AGGACGATGA GATCATGCAG GGATTCATCA GGGCTTTCAG GCCCCTGCCC AGGCACCTAT 540 
GGTACTTGCT GGACTTGAAA CAGATGGAAG AGCCGTGCCG TTTCTAGCTC ACCTCCGCCT 600 
CCAGGAAGAC CAGACTCCCA CCCTTCCACA CCTCCAGAGC AGTGGGACTT CCTCCTGCCC 660 
TTTCAAAGAA TAACCACAGC TCAGAAGACG ATGACGTGGT CATCTGTGTC GCCATCCCCT 720 
TCCTGCTGCA CACCTGCACC ATTGCCATGG GGAGGCTGCT CCCTGGGGGC AGAGTCTCTG 780 
GCAGAGGTTA TTAATAAACC CTTGGAGCAT G 

Seq ID NO: 551 Protein sequence 
Protein Accession #: NP_002562.1 

1 11 21 31 41 51 

I I I I I I 

MDIPQTKQDL ELPKLAGTWH SMAMATNNIS LMATLKAPLR VHITSLLPTP EDNLEIVLHR 60 
WENNSCVEKK VLGEKTGNPK KFKINYTVAN EATLLDTDYD NFLFLCLQDT TTPIQSMMCQ 120 
YLARVLVEDD EIMQGFIRAF RPLPRHLWYL LDLKQMEEPC RF 

Seq ID NO: 552 DNA sequence 

Nucleic Acid Accession #: NM_006500.1 

Coding sequence: 27..1967 



1 11 21 31 41 51 

I I I I I I 

ACTTGCGTCT CGCCCTCCGG CCAAGCATGG GGCTTCCCAG GCTGGTCTGC GCCTTCTTGC 60 
TCGCCGCCTG CTGCTGCTGT CCTCGCGTCG CGGGTGTGCC CGGAGAGGCT GAGCAGCCTG 120 
CGCCTGAGCT GGTGGAGGTG GAAGTGGGCA GCACAGCCCT TCTGAAGTGC GGCCTCTCCC 180 
AGTCCCAAGG CAACCTCAGC CATGTCGACT GGTTTTCTGT CCACAAGGAG AAGCGGACGC 240 
TCATCTTCCG TGTGCGCCAG GGCCAGGGCC AGAGCGAACC TGGGGAGTAC GAGCAGCGGC 300 
TCAGCCTCCA GGACAGAGGG GCTACTCTGG CCCTGACTCA AGTCACCCCC CAAGACGAGC 360 
GCATCTTCTT GTGCCAGGGC AAGCGCCCTC GGTCCCAGGA GTACCGCATC CAGCTCCGCG 420 
TCTACAAAGC TCCGGAGGAG CCAAACATCC AGGTCAACCC CCTGGGCATC CCTGTGAACA 480 
GTAAGGAGCC TGAGGAGGTC GCTACCTGTG TAGGGAGGAA CGGGTACCCC ATTCCTCAAG 540 
TCATCTGGTA CAAGAATGGC CGGCCTCTGA AGGAGGAGAA GAACCGGGTC CACATTCAGT 600 
CGTCCCAGAC TGTGGAGTCG AGTGGTTTGT ACACCTTGCA GAGTATTCTG AAGGCACAGC 660 
TGGTTAAAGA AGACAAAGAT GCCCAGTTTT ACTGTGAGCT CAACTACCGG CTGCCCAGTG 720 
GGAACCACAT GAAGGAGTCC AGGGAAGTCA CCGTCCCTGT TTTCTACCCG ACAGAAAAAG 780 
TGTGGCTGGA AGTGGAGCCC GTGGGAATGC TGAAGGAAGG GGACCGCGTG GAAATCAGGT 840 
GTTTGGCTGA TGGCAACCCT CCACCACACT TCAGCATCAG CAAGCAGAAC CCCAGCACCA 900 
GGGAGGCAGA GGAAGAGACA ACCAACGACA ACGGGGTCCT GGTGCTGGAG CCTGCCCGGA 960 
AGGAACACAG TGGGCGCTAT GAATGTCAGG CCTGGAACTT GGACACCATG ATATCGCTGC 1020 
TGAGTGAACC ACAGGAACTA CTGGTGAACT ATGTGTCTGA CGTCCGAGTG AGTCCCGCAG 1080 
CCCCTGAGAG ACAGGAAGGC AGCAGCCTCA CCCTGACCTG TGAGGCAGAG AGTAGCCAGG 1140 
ACCTCGAGTT CCAGTGGCTG AGAGAAGAGA CAGACCAGGT GCTGGAAAGG GGGCCTGTGC 1200 



TTCAGTTGCA TGACCTGAAA CGGGAGGCAG GAGGCGGCTA TCGCTGCGTG GCGTCTGTGC 1260 

CCAGCATACC CGGCCTGAAC CGCACACAGC TGGTCAAGCT GGCCATTTTT GGCCCCCCTT 1320 

GGATGGCATT CAAGGAGAGG AAGGTGTGGG TGAAAGAGAA TATGGTGTTG AATCTGTCTT 1380 

GTGAAGCGTC AGGGCACCCC CGGCCCACCA TCTCCTGGAA CGTCAACGGC ACGGCAAGTG 1440 

AACAAGACCA AGATCCACAG CGAGTCCTGA GCACCCTGAA TGTCCTCGTG ACCCCGGAGC 1500 

TGTTGGAGAC AGGTGTTGAA TGCACGGCCT CCAACGACCT GGGCAAAAAC ACCAGCATCC 1560 

TCTTCCTGGA GCTGGTCAAT TTAACCACCC TCACACCAGA CTCCAACACA ACCACTGGCC 1620 

TCAGCACTTC CACTGCCAGT CCTCATACCA GAGCCAACAG CACCTCCACA GAGAGAAAGC 1680 

TGCCGGAGCC GGAGAGCCGG GGCGTGGTCA TCGTGGCTGT GATTGTGTGC ATCCTGGTCC 1740 

TGGCGGTGCT GGGCGCTGTC CTCTATTTCC TCTATAAGAA GGGCAAGCTG CCGTGCAGGC 1800 

GCTCAGGGAA GCAGGAGATC ACGCTGCCCC CGTCTCGTAA GACCGAACTT GTAGTTGAAG 1860 

TTAAGTCAGA TAAGCTCCCA GAAGAGATGG GCCTCCTGCA GGGCAGCAGC GGTGACAAGA 1920 

GGGCTCCGGG AGACCAGGGA GAGAAATACA TCGATCTGAG GCATTAGCCC CGAATCACTT 1980 

CAGCTCCCTT CCCTGCCTGG ACCATTCCCA GCTCCCTGCT CACTCTTCTC TCAGCCAAAG 2040 

CCTCCAAAGG GACTAGAGAG AAGCCTCCTG CTCCCCTCAC CTGCACACCC CCTTTCAGAG 2100 

GGCCACTGGG TTAGGACCTG AGGACCTCAC TTGGCCCTGC AAGCCGCTTT TCAGGGACCA 2160 

GTCCACCACC ATCTCCTCCA CGTTGAGTGA AGCTCATCCC AAGCAAGGAG CCCCAGTCTC 2220 

CCGAGCGGGT AGGAGAGTTT CTTGCAGAAC GTGTTTTTTC TTTACACACA TTATGGCTGT 2280 

AAATACCTGG CTCCTGCCAG CAGCTGAGCT GGGTAGCCTC TCTGAGCTGG TTTCCTGCCC 2340 

CAAAGGCTGG CTTCCACCAT CCAGGTGCAC CACTGAAGTG AGGACACACC GGAGCCAGGC 2400 

GCCTGCTCAT GTTGAAGTGC GCTGTTCACA CCCGCTCCGG AGAGCACCCC AGCGGCATCC 2460 

AGAAGCAGCT GCAGTGTTGC TGCCACCACC CTCCTGCTCG CCTCTTCAAA GTCTCCTGTG 2520 

ACATTTTTTC TTTGGTCAGA AGCCAGGAAC TGGTGTCATT CCTTAAAAGA TACGTGCCGG 2580 

GGCCAGGTGT GGTGGCTCAC GCCTGTAATC CCAGCACTTT GGGAGGCCGA GGCGGGCGGA 2640 

TCACAAAGTC AGGACGAGAC CATCCTGGCT AACACGGTGA AACCCTGTCT CTACTAAAAA 2700 

TACAAAAAAA AATTAGCTAG GCGTAGTGGT TGGCACCTAT AGTCCCAGCT ACTCGGAAGG 2760 

CTGAAGCAGG AGAATGGTAT GAATCCAGGA GGTGGAGCTT GCAGTGAGCC GAGACCGTGC 2820 

CACTGCACTC CAGCCTGGGC AACACAGCGA GACTCCGTCT CGAGGAAAAA AAAAGAAAAG 2880 

ACGCGTACCT GCGGTGAGGA AGCTGGGCGC TGTTTTCGAG TTCAGGTGAA TTAGCCTCAA 2940 

TCCCCGTGTT CACTTGCTCC CATAGCCCTC TTGATGGATC ACGTAAAACT GAAAGGCAGC 3000 

GGGGAGCAGA CAAAGATGAG GTCTACACTG TCCTTCATGG GGATTAAAGC TATGGTTATA 3060 

TTAGCACCAA ACTTCTACAA ACCAAGCTCA GGGCCCCAAC CCTAGAAGGG CCCAAATGAG 3120 

AGAATGGTAC TTAGGGATGG AAAACGGGGC CTGGCTAGAG CTTCGGGTGT GTGTGTCTGT 3180 

CTGTGTGTAT GCATACATAT GTGTGTATAT ATGGTTTTGT CAGGTGTGTA AATTTGCAAA 3240 

TTGTTTCCTT TATATATGTA TGTATATATA TATATGAAAA TATATATATA TATGAAAAAT 3300 

AAAGCTTAAT TGTCCCAGAA AATCATACAT TGCTTTTTTA TTCTACATGG GTACCACAGG 3360 

AACCTGGGGG CCTGTGAAAC TACAACCAAA AGGCACACAA AACCGTTTCC AGTTGGCAGC 3420 

AGAGATCAGG GGTTACCTCT GCTTCTGAGC AAATGGCTCA AGCTCTACCA GAGCAGACAG 3480 

CTACCCTACT TTTCAGCAGC AAAACGTCCC GTATGACGCA GCACGAAGGG CCTGGCAGGC 3540 
TGTTAGCAGG AGCTATGTCC CTTCCTATCG TTTCCGTCCA CTT 

Seq ID NO: 553 Protein sequence 
Protein Accession #: NP_006491.1 

1 11 21 31 41 51 

I I I I I I 

GLPRLVCAFL LAACCCCPRV AGVPGEAEQP APELVEVEVG STALLKCGLS QSQGNLSHVD 60 
WFSVHKEKRT LIFRVRQGQG QSEPGEYEQR LSLQDRGATL ALTQVTPQDE RIFLCQGKRP 120 
RSQEYRIQLR VYKAPEEPNI QVNPLGIPVN SKEPEEVATC VGRNGYPIPQ VIWYKNGRPL 180 
KEEKNRVHIQ SSQTVESSGL YTLQSILKAQ LVKEDKDAQF YCELNYRLPS GNHMKESREV 240 
TVPVFYPTEK VWLEVEPVGM LKEGDRVEIR CLADGNPPPH FSISKQNPST REAEEETTND 300 
NGVLVLEPAR KEHSGRYECQ AWNLDTMISL LSEPQELLVN YVSDVRVSPA APERQEGSSL 360 
TLTCEAESSQ DLEFQWLREE TDQVLERGPV LQLHDLKREA GGGYRCVASV PSIPGLNRTQ 420 
LVKLAIFGPP WMAFKERKVW VKENMVLNLS CEASGHPRPT ISWNVNGTAS EQDQDPQRVL 480 
STLNVLVTPE LLETGVECTA SNDLGKNTSI LFLELVNLTT LTPDSNTTTG LSTSTASPHT 540 
RANSTSTERK LPEPESRGVV IVAVIVCILV LAVLGAVLYF LYKKGKLPCR RSGKQEITLP 600 
PSRKTELVVE VKSDKLPEEM GLLQGSSGDK RAPGDQGEKY IDLRH 

Seq ID NO: 554 DNA sequence 
Nucleic Acid Accession #: NM_003183.3 
Coding sequence: 165..2639 

1 11 21 31 41 51 

I I I I I I 

TCGAGCCTGG CGGTAGAATC TTCCCAGTAG GCGGCGCGGG AGGGAAAAGA GGATTGAGGG 60 
GCTAGGCCGG GCGGATCCCG TCCTCCCCCG ATGTGAGCAG TTTTCCGAAA CCCCGTCAGG 120 
CGAAGGCTGC CCAGAGAGGT GGAGTCGGTA GCGGGGCCGG GAACATGAGG CAGTCTCTCC 180 
TATTCCTGAC CAGCGTGGTT CCTTTCGTGC TGGCGCCGCG ACCTCCGGAT GACCCGGGCT 240 
TCGGCCCCCA CCAGAGACTC GAGAAGCTTG ATTCTTTGCT CTCAGACTAC GATATTCTCT 300 
CTTTATCTAA TATCCAGCAG CATTCGGTAA GAAAAAGAGA TCTACAGACT TCAACACATG 360 
TAGAAACACT ACTAACTTTT TCAGCTTTGA AAAGGCATTT TAAATTATAC CTGACATCAA 420 
GTACTGAACG TTTTTCACAA AATTTCAAGG TCGTGGTGGT GGATGGTAAA AACGAAAGCG 480 
AGTACACTGC AAAATGGCAG GACTTCTTCA CTGGACACGT GGTTGGTGAG CCTGACTCTA 540 
GGGTTCTAGC CCACATAAGA GATGATGATG TTATAATCAG AATCAACACA GATGGGGCCG 600 
AATATAACAT AGAGCCACTT TGGAGATTTG TTAATGATAC CAAAGACAAA AGAATGTTAG 660 
TTTATAAATC TGAAGATATC AAGAATGTTT CACGTTTGCA GTCTCCAAAA GTGTGTGGTT 720 
ATTTAAAAGT GGATAATGAA GAGTTGCTCC CAAAAGGGTT AGTAGACAGA GAACCACCTG 780 
AAGAGCTTGT TCATCGAGTG AAAAGAAGAG CTGACCCAGA TCCCATGAAG AACACGTGTA 840 
AATTATTGGT GGTAGCAGAT CATCGCTTCT ACAGATACAT GGGCAGAGGG GAAGAGAGTA 900 
CAACTACAAA TTACTTAATA GAGCTAATTG ACAGAGTTGA TGACATCTAT CGGAACACTT 960 
CATGGGATAA TGCAGGTTTT AAAGGCTATG GAATACAGAT AGAGCAGATT CGCATTCTCA 1020 
AGTCTCCACA AGAGGTAAAA CCTGGTGAAA AGCACTACAA CATGGCAAAA AGTTACCCAA 1080 
ATGAAGAAAA GGATGCTTGG GATGTGAAGA TGTTGCTAGA GCAATTTAGC TTTGATATAG 1140 
CTGAGGAAGC ATCTAAAGTT TGCTTGGCAC ACCTTTTCAC ATACCAAGAT TTTGATATGG 1200 
GAACTCTTGG ATTAGCTTAT GTTGGCTCTC CCAGAGCAAA CAGCCATGGA GGTGTTTGTC 1260 
CAAAGGCTTA TTATAGCCCA GTTGGGAAGA AAAATATCTA TTTGAATAGT GGTTTGACGA 1320 
GCACAAAGAA TTATGGTAAA ACCATCCTTA CAAAGGAAGC TGACCTGGTT ACAACTCATG 1380 
TCTAGCAGAA TGTGCCCCGA 1440 
TGTGAGTGGC GATCACGAGA 1500 
TAAGACCATT GAAAGTAAGG 1560 
GAACTCGAGG GTGGATGAAG 1620 
CACCTGCTGC AACAGCGACT 1680 
TCCTTGCTGT AAAAACTGTC 1740 
TGCTACTTGC AAAGGCGTGT 1800 
AAATGCTGAA AATGACACTG 1860 
CCCTTTCTGC GAGAGGGAAC 1920 
CTGCAAGGTG TGCTGCAGGG 1980 
AAAGAACTTA TTTTTGAGGA 2040 
CAAATGTGAG AAACGAGTAC 2100 
GAGCATCAAT ACTTTTGGAA 2160 
CTCCTTGATA TTTTGGATTC 2220 
TAAACAGTAT GAATCTCTGT 2280 
GGATTCTGCA TCGGTTCGCA 2340 
GCAGCCTGCC CCTGTGATCC 2400 
GGACACCATC CAGGAAGACC 2460 
GGACCCCTTC CCAAATAGCA 2520 
GGTCGCCAGA AGTGAAAAGG 2580 
CAAAGAAACA GAGTGCTAAT 2640 
TTATAGATTT GACCTACAAA 2700 
TAGCAGATGC TGGTCATGTG 2760 
CCCTTCTCCT TTTGAAAAGG 2820 
TTTAGTTTTT AAAATATCTT 2880 
GTTATGAATA TTTACGTTTT 2940 
GAAATGATCA GTTTTTTTTT 3000 
TTATTTGTGA GAAAAGTGGA 3060 
AAACAAAGGA GATAAATTTA 3120 
TACCCAGAGT TTTTATGTAG 3180 
AATATGGCTC TTCATAATTC 3240 
GGGCTATACA TGGTAGCCAG 3300 
TGCTGGGCAG TTTTTCTGTA 3360 
TACATGTCTT AGAAAATTCA 3420 
ACTTGGAGAG GCTGAGGTTG 3480 
CTC 

Seq ID NO: 555 Protein sequence 
Protein Accession #: NP_003174.2 

1 11 21 31 

I I I I 

DYDILSLSNI QQHSVRKRDL 60 
GKNESEYTAK WQDFFTGHVV 120 
DKRMLVYKSE DIKNVSRLQS 180 
MKNTCKLLVV ADHRFYRYMG 240 
QIRILKSPQE VKPGEKHYNM 300 
QDFDMGTLGL AYVGSPRANS 360 
LVTTHELGHN FGAEHDPDGL 420 
TIESKAQECF QERSNKVCGN 480 
CCKNCQFETA QKKCQEAINA 540 
FCEREQQLES CACNETDNSC 600 
CEKRVQDVIE RFWDFIDQLS 660 
QYESLSLFHP SNVEMLSSMD 720 
TIQEDPSTDS HMDEDGFEKD 780 
ETEC 







Seq ID NO: 556 DNA sequence 

Nucleic Acid Accession #: NM_021832.1 

Coding sequence: 164..2248 

1 11 21 31 41 51 

I I I I I I 

TCGAGCCTGG CGGTAGAATC TTCCCAGTAG GCGGCGCGGG AGGAAAAGAG GATTGAGGGG 60 

CTAGGCCGGG CGGATCCCGT CCTCCCCCGA TGTGAGCAGT TTTCCGAAAC CCCGTCAGGC 120 

GAAGGCTGCC CAGAGAGGTG GAGTCGGTAG CGGGGCCGGG AACATGAGGC AGTCTCTCCT 180 

ATTCCTGACC AGCGTGGTTC CTTTCGTGCT GGCGCCGCGA CCTCCGGATG ACCCGGGCTT 240 

CGGCCCCCAC CAGAGACTCG AGAAGCTTGA TTCTTTGCTC TCAGACTACG ATATTCTCTC 300 

TTTATCTAAT ATCCAGCAGC ATTCGGTAAG AAAAAGAGAT CTACAGACTT CAACACATGT 360 

AGAAACACTA CTAACTTTTT CAGCTTTGAA AAGGCATTTT AAATTATACC TGACATCAAG 420 

TACTGAACGT TTTTCACAAA ATTTCAAGGT CGTGGTGGTG GATGGTAAAA ACGAAAGCGA 480 

GTACACTGTA AAATGGCAGG ACTTCTTCAC TGGACACGTG GTTGGTGAGC CTGACTCTAG 540 

GGTTCTAGCC CACATAAGAG ATGATGATGT TATAATCAGA ATCAACACAG ATGGGGCCGA 600 

ATATAACATA GAGCCACTTT GGAGATTTGT TAATGATACC AAAGACAAAA GAATGTTAGT 660 

TTATAAATCT GAAGATATCA AGAATGTTTC ACGTTTGCAG TCTCCAAAAG TGTGTGGTTA 720 

TTTAAAAGTG GATAATGAAG AGTTGCTCCC AAAAGGGTTA GTAGACAGAG AACCACCTGA 780 

AGAGCTTGTT CATCGAGTGA AAAGAAGAGC TGACCCAGAT CCCATGAAGA ACACGTGTAA 840 

ATTATTGGTG GTAGCAGATC ATCGCTTCTA CAGATACATG GGCAGAGGGG AAGAGAGTAC 900 

AACTACAAAT TACTTAATAG AGCTAATTGA CAGAGTTGAT GACATCTATC GGAACACTTC 960 

ATGGGATAAT GCAGGTTTTA AAGGCTATGG AATACAGATA GAGCAGATTC GCATTCTCAA 1020 

GTCTCCACAA GAGGTAAAAC CTGGTGAAAA GCACTACAAC ATGGCAAAAA GTTACCCAAA 1080 

TGAAGAAAAG GATGCTTGGG ATGTGAAGAT GTTGCTAGAG CAATTTAGCT TTGATATAGC 1140 

TGAGGAAGCA TCTAAAGTTT GCTTGGCACA CCTTTTCACA TACCAAGATT TTGATATGGG 1200 

AACTCTTGGA TTAGCTTATG TTGGCTCTCC CAGAGCAAAC AGCCATGGAG GTGTTTGTCC 1260 

AAAGGCTTAT TATAGCCCAG TTGGGAAGAA AAATATCTAT TTGAATAGTG GTTTGACGAG 1320 

CACAAAGAAT TATGGTAAAA CCATCCTTAC AAAGGAAGCT GACCTGGTTA CAACTCATGA 1380 

ATTGGGACAT AATTTTGGAG CAGAACATGA TCCGGATGGT CTAGCAGAAT GTGCCCCGAA 1440 
TGAGGACCAG GGAGGGAAAT ATGTCATGTA TCCCATAGCT GTGAGTGGCG ATCACGAGAA 1500 

CAATAAGATG TTTTCAAACT GCAGTAAACA ATCAATCTAT AAGACCATTG AAAGTAAGGC 1560 

CCAGGAGTGT TTTCAAGAAC GCAGCAATAA AGTTTGTGGG AACTCGAGGG TGGATGAAGG 1620 

AGAAGAGTGT GATCCTGGCA TCATGTATCT GAACAACGAC ACCTGCTGCA ACAGCGACTG 1680 

CACGTTGAAG GAAGGTGTCC AGTGCAGTGA CAGGAACAGT CCTTGCTGTA AAAACTGTCA 1740 

GTTTGAGACT GCCCAGAAGA AGTGCCAGGA GGCGATTAAT GCTACTTGCA AAGGCGTGTC 1800 

CTACTGCACA GGTAATAGCA GTGAGTGCCC GCCTCCAGGA AATGCTGAAG ATGACACTGT 1860 

TTGCTTGGAT CTTGGCAAGT GTAAGGATGG GAAATGCATC CCTTTCTGCG AGAGGGAACA 1920 

GCAGCTGGAG TCCTGTGCAT GTAATGAAAC TGACAACTCC TGCAAGGTGT GCTGCAGGGA 1980 

CCTTTCCGGC CGCTGTGTGC CCTATGTCGA TGCTGAACAA AAGAACTTAT TTTTGAGGAA 2040 

AGGAAAGCCC TGTACAGTAG GATTTTGTGA CATGAATGGC AAATGTGAGA AACGAGTACA 2100 

GGATGTAATT GAACGATTTT GGGATTTCAT TGACCAGCTG AGCATCAATA CTTTTGGAAA 2160 

GTTTTTAGCA GACAACATCG TTGGGTCTGT CCTGGTTTTC TCCTTGATAT TTTGGATTCC 2220 

TTTCAGCATT CTTGTCCATT GTGTGTAACG TCGAAATGCT GAGCAGCATG GATTCTGCAT 2280 

CGGTTCGCAT TATCAAACCC TTTCCTGCGC CCCAGACTCC AGGCCGCCTG CAGCCTGCCC 2340 

CTGTGATCCC TTCGGCGCCA GCAGCTCCAA AACTGGACCA CCAGAGAATG GACACCATCC 2400 

AGGAAGACCC CAGCACAGAC TCACATATGG ACGAGGATGG GTTTGAGAAG GACCCCTTCC 2460 

CAAATAGCAG CACAGCTGCC AAGTCATTTG AGGATCTCAC GGACCATCCG GTCACCAGAA 2520 

GTGAAAAGGC TGCCTCCTTT AAACTGCAGC GTCAGAATCG TGTTGACAGC AAAGAAACAG 2580 

AGTGCTAATT TAGTTCTCAG CTCTTCTGAC TTAAGTGTGC AAAATATTTT TATAGATTTG 2640 

ACCTACAATC AATCACAGCT TATATTTTGT GAAGACTGGG AAGTGACTTA GCAGATGCTG 2700 

GTCATGTGTT TGAACTTCCT GCAGGTAAAC AGTTCTTGTG TGGTTTGGCC CTTCTCCTTT 2760 

TGAAAAGGTA AGGTGAAGGT GAATCTAGCT TATTTTGAGG CTTTCAGGTT TTAGTTTTTA 2820 

AAATATCTTT TGACCTGTGG TGCAAAAGCA GAAAATACAG CTGGATTGGG TTATGAGTAT 2880 

TTACGTTTTT GTAAATTAAT CTTTTATATT GATAACAGGC ACTGACTAGG GAAATGATCA 2940 

GTTTTTTTTT ATACACTGTA ATGAACCGCT GAATATGAAG CATTTGGCAT TTATTTGTGA 3000 

GAAAAGTGGA ATAGTTTTTT TTTTTTTTTT TTTTTTTTGC CTTCAACTAA AAACAAAGGA 3060 

GATAAATTTA GTATACATTG TATCTAAATT GTGGGTCTAT TTCTAGTTAT TACCCAGAGT 3120 

TTTTATGTAG CAGGGAAAAT ATATATCTAA ATTTAGAAAT CATTTGGGTT AATATGGCTC 3180 

TTCATAATTC TAAGACTAAT GCTCAGAACC TAACCACTAC CTTACAGTGA GGGCTATACA 3240 

TGGTAGCCAG TTGAATTTAT GGAATCTACC AACTGTTTAG GGCCCTGATT TGCTGGGCAG 3300 

TTTTTCTGTA TTTTATAAGT ATCTTCATGT ATCCCTGTTA CTGATAGGGA TACATGTCTT 3360 

AGAAAATTCA CTATTGGCTG GGAGTGGTGG CTCATGCCTG TAATCCCAGC ACTTGGAGAG 3420 
GCTGAGGTTG CGCCACTACA CTCCAGCCTG GGTGACAGAG TGAGATCTGC CTC 

Seq ID NO: 557 Protein sequence 
Protein Accession #: NP_068604.1 

1 11 21 31 41 51 

I I I I I I 

MRQSLLFLTS VVPFVLAPRP PDDPGFGPHQ RLEKLDSLLS DYDILSLSNI QQHSVRKRDL 60 

QTSTHVETLL TFSALKRHFK LYLTSSTERF SQNFKVVVVD GKNESEYTVK WQDFFTGHVV 120 

GEPDSRVLAH IRDDDVIIRI NTDGAEYNIE PLWRFVNDTK DKRMLVYKSE DIKNVSRLQS 180 

PKVCGYLKVD NEELLPKGLV DREPPEELVH RVKRRADPDP MKNTCKLLVV ADHRFYRYMG 240 

RGEESTTTNY LIELIDRVDD IYRNTSWDNA GFKGYGIQIE QIRILKSPQE VKPGEKHYNM 300 

AKSYPNEEKD AWDVKMLLEQ FSFDIAEEAS KVCLAHLFTY QDFDMGTLGL AYVGSPRANS 360 

HGGVCPKAYY SPVGKKNIYL NSGLTSTKNY GKTILTKEAD LVTTHELGHN FGAEHDPDGL 420 

AECAPNEDQG GKYVMYPIAV SGDHENNKMF SNCSKQSIYK TIESKAQECF QERSNKVCGN 480 

SRVDEGEECD PGIMYLNNDT CCNSDCTLKE GVQCSDRNSP CCKNCQFETA QKKCQEAINA 540 

TCKGVSYCTG NSSECPPPGN AEDDTVCLDL GKCKDGKCIP FCEREQQLES CACNETDNSC 600 

KVCCRDLSGR CVPYVDAEQK NLFLRKGKPC TVGFCDMNGK CEKRVQDVIE RFWDFIDQLS 660 
INTFGKFLAD NIVGSVLVFS LIFWIPFSIL VHCV 

Seq ID NO: 558 DNA sequence 

Nucleic Acid Accession #: NM_004994.1 

Coding sequence: 20..2143 

1 11 21 31 41 51 

I I I I I I 

AGACACCTCT GCCCTCACCA TGAGCCTCTG GCAGCCCCTG GTCCTGGTGC TCCTGGTGCT 60 

GGGCTGCTGC TTTGCTGCCC CCAGACAGCG CCAGTCCACC CTTGTGCTCT TCCCTGGAGA 120 

CCTGAGAACC AATCTCACCG ACAGGCAGCT GGCAGAGGAA TACCTGTACC GCTATGGTTA 180 

CACTCGGGTG GCAGAGATGC GTGGAGAGTC GAAATCTCTG GGGCCTGCGC TGCTGCTTCT 240 

CCAGAAGCAA CTGTCCCTGC CCGAGACCGG TGAGCTGGAT AGCGCCACGC TGAAGGCCAT 300 

GCGAACCCCA CGGTGCGGGG TCCCAGACCT GGGCAGATTC CAAACCTTTG AGGGCGACCT 360 

CAAGTGGCAC CACCACAACA TCACCTATTG GATCCAAAAC TACTCGGAAG ACTTGCCGCG 420 

GGCGGTGATT GACGACGCCT TTGCCCGCGC CTTCGCACTG TGGAGCGCGG TGACGCCGCT 480 

CACCTTCACT CGCGTGTACA GCCGGGACGC AGACATCGTC ATCCAGTTTG GTGTCGCGGA 540 

GCACGGAGAC GGGTATCCCT TCGACGGGAA GGACGGGCTC CTGGCACACG CCTTTCCTCC 600 

TGGCCCCGGC ATTCAGGGAG ACGCCCATTT CGACGATGAC GAGTTGTGGT CCCTGGGCAA 660 

GGGCGTCGTG GTTCCAACTC GGTTTGGAAA CGCAGATGGC GCGGCCTGCC ACTTCCCCTT 720 

CATCTTCGAG GGCCGCTCCT ACTCTGCCTG CACCACCGAC GGTCGCTCCG ACGGCTTGCC 780 

CTGGTGCAGT ACCACGGCCA ACTACGACAC CGACGACCGG TTTGGCTTCT GCCCCAGCGA 840 

GAGACTCTAC ACCCGGGACG GCAATGCTGA TGGGAAACCC TGCCAGTTTC CATTCATCTT 900 

CCAAGGCCAA TCCTACTCCG CCTGCACCAC GGACGGTCGC TCCGACGGCT ACCGCTGGTG 960 

CGCCACCACC GCCAACTACG ACCGGGACAA GCTCTTCGGC TTCTGCCCGA CCCGAGCTGA 1020 

CTCGACGGTG ATGGGGGGCA ACTCGGCGGG GGAGCTGTGC GTCTTCCCCT TCACTTTCCT 1080 

GGGTAAGGAG TACTCGACCT GTACCAGCGA GGGCCGCGGA GATGGGCGCC TCTGGTGCGC 1140 

TACCACCTCG AACTTTGACA GCGACAAGAA GTGGGGCTTC TGCCCGGACC AAGGATACAG 1200 

TTTGTTCCTC GTGGCGGCGC ATGAGTTCGG CCACGCGCTG GGCTTAGATC ATTCCTCAGT 1260 

GCCGGAGGCG CTCATGTACC CTATGTACCG CTTCACTGAG GGGCCCCCCT TGCATAAGGA 1320 

CGACGTGAAT GGCATCCGGC ACCTCTATGG TCCTCGCCCT GAACCTGAGC CACGGCCTCC 1380 

AACCACCACC ACACCGCAGC CCACGGCTCC CCCGACGGTC TGCCCCACCG GACCCCCCAC 1440 

TGTCCACCCC TCAGAGCGCC CCACAGCTGG CCCCACAGGT CCCCCCTCAG CTGGCCCCAC 1500 

AGGTCCCCCC ACTGCTGGCC CTTCTACGGC CACTACTGTG CCTTTGAGTC CGGTGGACGA 1560 

TGCCTGCAAC GTGAACATCT TCGACGCCAT CGCGGAGATT GGGAACCAGC TGTATTTGTT 1620 

CAAGGATGGG AAGTACTGGC GATTCTCTGA GGGCAGGGGG AGCCGGCCGC AGGGCCCCTT 1680 
CCTTATCGCC GACAAGTGGC CCGCGCTGCC CCGCAAGCTG GACTCGGTCT TTGAGGAGCC 1740 

GCTCTCCAAG AAGCTTTTCT TCTTCTCTGG GCGCCAGGTG TGGGTGTACA CAGGCGCGTC 1800 

GGTGCTGGGC CCGAGGCGTC TGGACAAGCT GGGCCTGGGA GCCGACGTGG CCCAGGTGAC 1860 

CGGGGCCCTC CGGAGTGGCA GGGGGAAGAT GCTGCTGTTC AGCGGGCGGC GCCTCTGGAG 1920 

GTTCGACGTG AAGGCGCAGA TGGTGGATCC CCGGAGCGCC AGCGAGGTGG ACCGGATGTT 1980 

CCCCGGGGTG CCTTTGGACA CGCACGACGT CTTCCAGTAC CGAGAGAAAG CCTATTTCTG 2040 

CCAGGACCGC TTCTACTGGC GCGTGAGTTC CCGGAGTGAG TTGAACCAGG TGGACCAAGT 2100 

GGGCTACGTG ACCTATGACA TCCTGCAGTG CCCTGAGGAC TAGGGCTCCC GTCCTGCTTT 2160 

GCAGTGCCAT GTAAATCCCC ACTGGGACCA ACCCTGGGGA AGGAGCCAGT TTGCCGGATA 2220 

CAAACTGGTA TTCTGTTCTG GAGGAAAGGG AGGAGTGGAG GTGGGCTGGG CCCTCTCTTC 2280 
TCACCTTTGT TTTTTGTTGG AGTGTTTCTA ATAAACTTGG ATTCTCTAAC CTTT 

Seq ID NO: 559 Protein sequence 
Protein Accession #: NP_004985.1 

1 11 21 31 41 51 

I I I I I I 

MSLWQPLVLV LLVLGCCFAA PRQRQSTLVL FPGDLRTNLT DRQLAEEYLY RYGYTRVAEM 60 

RGESKSLGPA LLLLQKQLSL PETGELDSAT LKAMRTPRCG VPDLGRFQTF EGDLKWHHHN 120 

ITYWIQNYSE DLPRAVIDDA FARAFALWSA VTPLTFTRVY SRDADIVIQF GVAEHGDGYP 180 

FDGKDGLLAH AFPPGPGIQG DAHFDDDELW SLGKGVVVPT RFGNADGAAC HFPFIFEGRS 240 

YSACTTDGRS DGLPWCSTTA NYDTDDRFGF CPSERLYTRD GNADGKPCQF PFIFQGQSYS 300 

ACTTDGRSDG YRWCATTANY DRDKLFGFCP TRADSTVMGG NSAGELCVFP FTFLGKEYST 360 

CTSEGRGDGR LWCATTSNFD SDKKWGFCPD QGYSLFLVAA HEFGHALGLD HSSVPEALMY 420 

PMYRFTEGPP LHKDDVNGIR HLYGPRPEPE PRPPTTTTPQ PTAPPTVCPT GPPTVHPSER 480 

PTAGPTGPPS AGPTGPPTAG PSTATTVPLS PVDDACNVNI FDAIAEIGNQ LYLFKDGKYW 540 

RFSEGRGSRP QGPFLIADKW PALPRKLDSV FEEPLSKKLF FFSGRQVWVY TGASVLGPRR 600 

LDKLGLGADV AQVTGALRSG RGKMLLFSGR RLWRFDVKAQ MVDPRSASEV DRMFPGVPLD 660 
THDVFQYREK AYFCQDRFYW RVSSRSELNQ VDQVGYVTYD ILQCPED 

Seq ID NO: 560 DNA sequence 

Nucleic Acid Accession #: NM_000213.1 

Coding sequence: 127..5385 

1 11 21 31 41 51 

I I I I I I 

CGCCCGCGCG CTGCAGCCCC ATCTCCTAGC GGCAGCCCAG GCGCGGAGGG AGCGAGTCCG 60 

CCCCGAGGTA GGTCCAGGAC GGGCGCACAG CAGCAGCCGA GGCTGGCCGG GAGAGGGAGG 120 

AAGAGGATGG CAGGGCCACG CCCCAGCCCA TGGGCCAGGC TGCTCCTGGC AGCCTTGATC 180 

AGCGTCAGCC TCTCTGGGAC CTTGGCAAAC CGCTGCAAGA AGGCCCCAGT GAAGAGCTGC 240 

ACGGAGTGTG TCCGTGTGGA TAAGGACTGC GCCTACTGCA CAGACGAGAT GTTCAGGGAC 300 

CGGCGCTGCA ACACCCAGGC GGAGCTGCTG GCCGCGGGCT GCCAGCGGGA GAGCATCGTG 360 

GTCATGGAGA GCAGCTTCCA AATCACAGAG GAGACCCAGA TTGACACCAC CCTGCGGCGC 420 

AGCCAGATGT CCCCCCAAGG CCTGCGGGTC CGTCTGCGGC CCGGTGAGGA GCGGCATTTT 480 

GAGCTGGAGG TGTTTGAGCC ACTGGAGAGC CCCGTGGACC TGTACATCCT CATGGACTTC 540 

TCCAACTCCA TGTCCGATGA TCTGGACAAC CTCAAGAAGA TGGGGCAGAA CCTGGCTCGG 600 

GTCCTGAGCC AGCTCACCAG CGACTACACT ATTGGATTTG GCAAGTTTGT GGACAAAGTC 660 

AGCGTCCCGC AGACGGACAT GAGGCCTGAG AAGCTGAAGG AGCCCTGGCC CAACAGTGAC 720 

CCCCCCTTCT CCTTCAAGAA CGTCATCAGC CTGACAGAAG ATGTGGATGA GTTCCGGAAT 780 

AAACTGCAGG GAGAGCGGAT CTCAGGCAAC CTGGATGCTC CTGAGGGCGG CTTCGATGCC 840 

ATCCTGCAGA CAGCTGTGTG CACGAGGGAC ATTGGCTGGC GCCCGGACAG CACCCACCTG 900 

CTGGTCTTCT CCACCGAGTC AGCCTTCCAC TATGAGGCTG ATGGCGCCAA CGTGCTGGCT 960 

GGCATCATGA GCCGCAACGA TGAACGGTGC CACCTGGACA CCACGGGCAC CTACACCCAG 1020 

TACAGGACAC AGGACTACCC GTCGGTGCCC ACCCTGGTGC GCCTGCTCGC CAAGCACAAC 1080 

ATCATCCCCA TCTTTGCTGT CACCAACTAC TCCTATAGCT ACTACGAGAA GCTTCACACC 1140 

TATTTCCCTG TCTCCTCACT GGGGGTGCTG CAGGAGGACT CGTCCAACAT CGTGGAGCTG 1200 

CTGGAGGAGG CCTTCAATCG GATCCGCTCC AACCTGGACA TCCGGGCCCT AGACAGCCCC 1260 

CGAGGCCTTC GGACAGAGGT CACCTCCAAG ATGTTCCAGA AGACGAGGAC TGGGTCCTTT 1320 

CACATCCGGC GGGGGGAAGT GGGTATATAC CAGGTGCAGC TGCGGGCCCT TGAGCACGTG 1380 

GATGGGACGC ACGTGTGCCA GCTGCCGGAG GACCAGAAGG GCAACATCCA TCTGAAACCT 1440 

TCCTTCTCCG ACGGCCTCAA GATGGACGCG GGCATCATCT GTGATGTGTG CACCTGCGAG 1500 

CTGCAAAAAG AGGTGCGGTC AGCTCGCTGC AGCTTCAACG GAGACTTCGT GTGCGGACAG 1560 

TGTGTGTGCA GCGAGGGCTG GAGTGGCCAG ACCTGCAACT GCTCCACCGG CTCTCTGAGT 1620 

GACATTCAGC CCTGCCTGCG GGAGGGCGAG GACAAGCCGT GCTCCGGCCG TGGGGAGTGC 1680 

CAGTGCGGGC ACTGTGTGTG CTACGGCGAA GGCCGCTACG AGGGTCAGTT CTGCGAGTAT 1740 

GACAACTTCC AGTGTCCCCG CACTTCCGGG TTCCTCTGCA ATGACCGAGG ACGCTGCTCC 1800 

ATGGGCCAGT GTGTGTGTGA GCCTGGTTGG ACAGGCCCAA GCTGTGACTG TCCCCTCAGC 1860 

AATGCCACCT GCATCGACAG CAATGGGGGC ATCTGTAATG GACGTGGCCA CTGTGAGTGT 1920 

GGCCGCTGCC ACTGCCACCA GCAGTCGCTC TACACGGACA CCATCTGCGA GATCAACTAC 1980 

TCGGCGATCC ACCCGGGCCT CTGCGAGGAC CTACGCTCCT GCGTGCAGTG CCAGGCGTGG 2040 

GGCACCGGCG AGAAGAAGGG GCGCACGTGT GAGGAATGCA ACTTCAAGGT CAAGATGGTG 2100 

GACGAGCTTA AGAGAGCCGA GGAGGTGGTG GTGCGCTGCT CCTTCCGGGA CGAGGATGAC 2160 

GACTGCACCT ACAGCTACAC CATGGAAGGT GACGGCGCCC CTGGGCCCAA CAGCACTGTC 2220 

CTGGTGCACA AGAAGAAGGA CTGCCCTCCG GGCTCCTTCT GGTGGCTCAT CCCCCTGCTC 2280 

CTCCTCCTCC TGCCGCTCCT GGCCCTGCTA CTGCTGCTAT GCTGGAAGTA CTGTGCCTGC 2340 

TGCAAGGCCT GCCTGGCACT TCTCCCGTGC TGCAACCGAG GTCACATGGT GGGCTTTAAG 2400 

GAAGACCACT ACATGCTGCG GGAGAACCTG ATGGCCTCTG ACCACTTGGA CACGCCCATG 2460 

CTGCGCAGCG GGAACCTCAA GGGCCGTGAC GTGGTCCGCT GGAAGGTCAC CAACAACATG 2520 

CAGCGGCCTG GCTTTGCCAC TCATGCCGCC AGCATCAACC CCACAGAGCT GGTGCCCTAC 2580 

GGGCTGTCCT TGCGCCTGGC CCGCCTTTGC ACCGAGAACC TGCTGAAGCC TGACACTCGG 2640 

GAGTGCGCCC AGCTGCGCCA GGAGGTGGAG GAGAACCTGA ACGAGGTCTA CAGGCAGATC 2700 

TCCGGTGTAC ACAAGCTCCA GCAGACCAAG TTCCGGCAGC AGCCCAATGC CGGGAAAAAG 2760 

CAAGACCACA CCATTGTGGA CACAGTGCTG ATGGCGCCCC GCTCGGCCAA GCCGGCCCTG 2820 

CTGAAGCTTA CAGAGAAGCA GGTGGAACAG AGGGCCTTCC ACGACCTCAA GGTGGCCCCC 2880 

GGCTACTACA CCCTCACTGC AGACCAGGAC GCCCGGGGCA TGGTGGAGTT CCAGGAGGGC 2940 

GTGGAGCTGG TGGACGTACG GGTGCCCCTC TTTATCCGGC CTGAGGATGA CGACGAGAAG 3000 

CAGCTGCTGG TGGAGGCCAT CGACGTGCCC GCAGGCACTG CCACCCTCGG CCGCCGCCTG 3060 
GTAAACATCA CCATCATCAA GGAGCAAGCC AGAGACGTGG TGTCCTTTGA GCAGCCTGAG 3120 

TTCTCGGTCA GCCGCGGGGA CCAGGTGGCC CGCATCCCTG TCATCCGGCG TGTCCTGGAC 3180 

GGCGGGAAGT CCCAGGTCTC CTACCGCACA CAGGATGGCA CCGCGCAGGG CAACCGGGAC 3240 

TACATCCCCG TGGAGGGTGA GCTGCTGTTC CAGCCTGGGG AGGCCTGGAA AGAGCTGCAG 3300 

GTGAAGCTCC TGGAGCTGCA AGAAGTTGAC TCCCTCCTGC GGGGCCGCCA GGTCCGCCGT 3360 

TTCCACGTCC AGCTCAGCAA CCCTAAGTTT GGGGCCCACC TGGGCCAGCC CCACTCCACC 3420 

ACCATCATCA TCAGGGACCC AGATGAACTG GACCGGAGCT TCACGAGTCA GATGTTGTCA 3480 

TCACAGCCAC CCCCTCACGG CGACCTGGGC GCCCCGCAGA ACCCCAATGC TAAGGCOGCT 3540 

GGGTCCAGGA AGATCCATTT CAACTGGCTG CCCCCTTCTG GCAAGCCAAT GGGGTACAGG 3600 

GTAAAGTACT GGATTCAGGG TGACTCCGAA TCCGAAGCCC ACCTGCTCGA CAGCAAGGTG 3660 

CCCTCAGTGG AGCTCAGCAA CCTGTACCCG TATTGCGACT ATGAGATGAA GGTGTGCGCC 3720 

TACGGGGCTC AGGGCGAGGG ACCCTACAGC TCCCTGGTGT CCTGCCGCAC CCACCAGGAA 3780 

GTGCCCAGCG AGCCAGGGCG TCTGGCCTTC AATGTCGTCT CCTCCACGGT GACCCAGCTG 3840 

AGCTGGGCTG AGCCGGCTGA GACCAACGGT GAGATCACAG CCTACGAGGT CTGCTATGGC 3900 

CTGGTCAACG ATGACAACCG ACCTATTGGG CCCATGAAGA AAGTGCTGGT TGACAACCCT 3960 

AAGAACCGGA TGCTGCTTAT TGAGAACCTT CGGGAGTCCC AGCCCTACCG CTACACGGTG 4020 

AAGGCGCGCA ACGGGGCCGG CTGGGGGCCT GAGCGGGAGG CCATCATCAA CCTGGCCACC 4080 

CAGCCCAAGA GGCCCATGTC CATCCCCATC ATCCCTGACA TCCCTATCGT GGACGCCCAG 4140 

AGCGGGGAGG ACTACGACAG CTTCCTTATG TACAGCGATG ACGTTCTACG CTCTCCATCG 4200 

GGCAGCCAGA GGCCCAGCGT CTCCGATGAC ACTGAGCACC TGGTGAATGG CCGGATGGAC 4260 

TTTGCCTTCC CGGGCAGCAC CAACTCCCTG CACAGGATGA CCACGACCAG TGCTGCTGCC 4320 

TATGGCACCC ACCTGAGCCC ACACGTGCCC CACCGCGTGC TAAGCACATC CTCCACCCTC 4380 

ACACGGGACT ACAACTCACT GACCCGCTCA GAACACTCAC ACTCGACCAC ACTGCCGAGG 4440 

GACTACTCCA CCCTCACCTC CGTCTCCTCC CACGACTCTC GCCTGACTGC TGGTGTGCCC 4500 

GACACGCCCA CCCGCCTGGT GTTCTCTGCC CTGGGGCCCA CATCTCTCAG AGTGAGCTGG 4560 

CAGGAGCCGC GGTGCGAGCG GCCGCTGCAG GGCTACAGTG TGGAGTACCA GCTGCTGAAC 4620 

GGCGGTGAGC TGCATCGGCT CAACATCCCC AACCCTGCCC AGACCTCGGT GGTGGTGGAA 4680 

GACCTCCTGC CCAACCACTC CTACGTGTTC CGCGTGCGGG CCCAGAGCCA GGAAGGCTGG 4740 

GGCCGAGAGC GTGAGGGTGT CATCACCATT GAATCCCAGG TGCACCCGCA GAGCCCACTG 4800 

TGTCCCCTGC CAGGCTCCGC CTTCACTTTG AGCACTCCCA GTGCCCCAGG CCCGCTGGTG 4860 

TTCACTGCCC TGAGCCCAGA CTCGCTGCAG CTGAGCTGGG AGCGGCCACG GAGGCCCAAT 4920 

GGGGATATCG TCGGCTACCT GGTGACCTGT GAGATGGCCC AAGGAGGAGG GCCAGCCACC 4980 

GCATTCCGGG TGGATGGAGA CAGCCCCGAG AGCCGGCTGA CCGTGCCGGG CCTCAGCGAG 5040 

AACGTGCCCT ACAAGTTCAA GGTGCAGGCC AGGACCACTG AGGGCTTCGG GCCAGAGCGC 5100 

GAGGGCATCA TCACCATAGA GTCCCAGGAT GGAGGACCCT TCCCGCAGCT GGGCAGCCGT 5160 

GCCGGGCTCT TCCAGCACCC GCTGCAAAGC GAGTACAGCA GCATCACCAC CACCCACACC 5220 

AGCGCCACCG AGCCCTTCCT AGTGGATGGG CCGACCCTGG GGGCCCAGCA CCTGGAGGCA 5280 

GGCGGCTCCC TCACCCGGCA TGTGACCCAG GAGTTTGTGA GCCGGACACT GACCACCAGC 5340 

GGAACCCTTA GCACCCACAT GGACCAACAG TTCTTCCAAA CTTGACCGCA CCCTGCCCCA 5400 

CCCCCGCCAT GTCCCACTAG GCGTCCTCCC GACTCCTCTC CCGGAGCCTC CTCAGCTACT 5460 

CCATCCTTGC ACCCCTGGGG GCCCAGCCCA CCCGCATGCA CAGAGCAGGG GCTAGGTGTC 5520 

TCCTGGGAGG CATGAAGGGG GCAAGGTCCG TCCTCTGTGG GCCCAAACCT ATTTGTAACC 5580 

AAAGAGCTGG GAGCAGCACA AGGACCCAGC CTTTGTTCTG CACTTAATAA ATGGTTTTGC 5640 
TACTG 

Seq ID NO: 561 Protein sequence 
Protein Accession #: NP_000204.1

1 11 21 31 41 51

I I I I I I

MAGPRPSPWA RLLLAALISV SLSGTLANRC KKAPVKSCTE CVRVDKDCAY CTDEMFRDRR 60

CNTQAELLAA GCQRESIVVM ESSFQITEET QIDTTLRRSQ MSPQGLRVRL RPGEERHFEL 120

EVFEPLESPV DLYILMDFSN SMSDDLDNLK KMGQNLARVL SQLTSDYTIG FGKFVDKVSV 180

PQTDMRPEKL KEPWPNSDPP FSFKNVISLT EDVDEFRNKL QGERISGNLD APEGGFDAIL 240 

QTAVCTRDIG WRPDSTHLLV FSTESAFHYE ADGANVLAGI MSRNDERCHL DTTGTYTQYR 300 

TQDYPSVPTL VRLLAKHNII PIFAVTNYSY SYYEKLHTYF PVSSLGVIGE DSSNIVELLE 360 

EAFNRIRSNL DIRALDSPRG LRTEVTSKMF QKTRTGSFHI RRGEVGIYQV QLRALEHVDG 420 

THVCQLPEDQ KGNIHLKPSF SDGLKMDAGI ICDVCTCELQ KEVRSARCSF NGDFVCGQCV 480 

CSEGWSGQTC NCSTGSLSDI QPCLREGEDK PCSGRGECQC GHCVCYGEGR YEGQFCEYDN 540 

FQCPRTSGFL CNDRGRCSMG QCVCEPGWTG PSCDCPLSNA TCIDSNGGIC NGRGHCECGR 600 

CHCHQQSLYT DTICEINYSA IHPGLCEDLR SCVQCQAWGT GEKKGRTCEE CNFKVKMVDE 660 

LKRAEEVVVR CSFRDEDDDC TYSYTMEGDG APGPNSTVLV HKKKDCPPGS FWWLIPLLLL 720

LLPLLALLLL LCWKYCACCK ACLALLPCCN RGHMVGFKED HYMLRENLMA SDHLDTPMLR 780

SGNLKGRDVV RWKVTNNMQR PGFATHAASI NPTELVPYGL SLRLARLCTE NLLKPDTREC 840

AQLRQEVEEN LNEVYRQISG VHKLQQTKFR QQPNAGKKQD HTIVDTVLMA PRSAKPALLK 900 

LTEKQVEQRA FHDLKVAPGY YTLTADQDAR GMVEFQEGVE LVDVRVPLFI RPEDDDEKQL 960 

LVEAIDVPAG TATLGRRLVN ITIIKEQARD VVSFEQPEFS VSRGDQVARI PVIRRVLDGG 1020

KSQVSYRTQD GTAQGNRDYI PVEGELLFQP GEAWKELQVK LLELQEVDSL LRGRQVRRFH 1080 

VQLSNPKFGA HLGQPHSTTI IIRDPDELDR SFTSQMLSSQ PPPHGDLGAP QNPNAKAAGS 1140 

RKIHFNWLPP SGKPMGYRVK YWIQGDSESE AHLLDSKVPS VELTNLYPYC DYEMKVCAYG 1200 

AQGEGPYSSL VSCRTHQEVP SEPGRLAFNV VSSTVTQLSW AEPAETNGEI TAYEVCYGLV 1260 

NDDNRPIGPM KKVLVDNPKN RMLLIENLRE SQPYRYTVKA RNGAGWGPER EAIINLATQP 1320 

KRPMSIPIIP DIPIVDAQSG EDYDSFLMYS DDVLRSPSGS QRPSVSDDTE HLVNGRMDFA 1380 

FPGSTNSLHR MTTTSAAAYG THLSPHVPHR VLSTSSTLTR DYNSLTRSEH SHSTTLPRDY 1440

STLTSVSSHD SRLTAGVPDT PTRLVFSALG PTSLRVSWQE PRCERPLQGY SVEYQLLNGG 1500

ELHRLNIPNP AQTSVVVEDL LPNHSYVFRV RAQSQEGWGR EREGVITIES QVHPQSPLCP 1560

LPGSAFTLST PSAPGPLVFT ALSPDSLQLS WERPRRPNGD IVGYLVTCEM AQGGGPATAF 1620 

RVDGDSPESR LTVPGLSENV PYKFKVQART TEGFGPEREG IITIESQDGG PFPQLGSRAG 1680 

LFQHPLQSEY SSITTTHTSA TEPFLVDGPT LGAQHLEAGG SLTRHVTQEF VSRTLTTSGT 1740
LSTHMDQQFF QT 

Seq ID NO: 562 DNA sequence 

Nucleic Acid Accession #: NM_013332.1

Coding sequence: 1..63

1 11 21 31 41 51

I I I I I I 



GCACGAGGGC GCTTTTGTCT CCGGTGAGTT TTGTGGCGGG AAGCTTCTGC GCTGGTGCTT 60

AGTAACCGAC TTTCCTCCGG ACTCCTGCAC GACCTGCTCC TACAGCCGGC GATCCACTCC 120

CGGCTGTTCC CCCGGAGGGT CCAGAGGCCT TTCAGAAGGA GAAGGCAGCT CTGTTTCTCT 180

GCAGAGGAGT AGGGTCCTTT CAGCCATGAA GCATGTGTTG AACCTCTACC TGTTAGGTGT 240

GGTACTGACC CTACTCTCCA TCTTCGTTAG AGTGATGGAG TCCCTAGAAG GCTTACTAGA 300

GAGCCCATCG CCTGGGACCT CCTGGACCAC CAGAAGCCAA CTAGCCAACA CAGAGCCCAC 360

CAAGGGCCTT CCAGACCATC CATCCAGAAG CATGTGATAA GACCTCCTTC CATACTGGCC 420

ATATTTTGGA ACACTGACCT AGACATGTCC AGATGGGAGT CCCATTCCTA GCAGACAAGC 480

TGAGCACCGT TGTAACCAGA GAACTATTAC TAGGCCTTGA AGAACCTGTC TAACTGGATG 540

CTCATTGCCT GGGCAAGGCC TGTTTAGGCC GGTTGCGGTG GCTCATGCCT GTAATCCTAG 600

CACTTTGGGA GGCTGAGGTG GGTGGATCAC CTGAGGTCAG GAGTTCGAGA CCAGCCTCGC 660

CAACATGGCG AAACCCCATC TCTACTAAAA ATACAAAAGT TAGCTGGGTG TGGTGGCAGA 720

GGCCTGTAAT CCCAGTTCCT TGGGAGGCTG AGGCGGGAGA ATTGCTTGAA CCCGGGGACG 780

GAGGTTGCAG TGAACCGAGA TCGCACTGCT GTACCCAGCC TGGGCCACAG TGCAAGACTC 840

CATCTCAAAA AAAAAAAGAA AAGAAAAAGC CTGTTTAATG CACAGGTGTG AGTGGATTGC 900

TTATGGCTAT GAGATAGGTT GATCTCGCCC TTACCCCGGG GTCTGGTGTA TGCTGTGCTT 960

TCCTCAGCAG TATGGCTCTG ACATCTCTTA GATGTCCCAA CTTCAGCTGT TGGGAGATGG 1020

TGATATTTTC AACCCTACTT CCTAAACATC TGTCTGGGGT TCCTTTAGTC TTGAATGTCT 1080

TATGCTCAAT TATTTGGTGT TGAGCCTCTC TTCCACAAGA GCTCCTCCAT GTTTGGATAG 1140

CAGTTGAAGA GGTTGTGTGG GTGGGCTGTT GGGAGTGAGG ATGGAGTGTT CAGTGCCCAT 1200

TTCTCATTTT ACATTTTAAA GTCGTTCCTC CAACATAGTG TGTATTGGTC TGAAGGGGGT 1260

GGTGGGATGC CAAAGCCTGC TCAAGTTATG GACATTGTGG CCACCATGTG GCTTAAATGA 1320
TTTTTTCTAA CTAATAAAGT GGAATATATA TTTCAAAAAA AAAAAAAAAA AA

Seq ID NO: 563 Protein sequence
Protein Accession #: NP_037464.1

1 11 21 31 41 51

I I I I I I

MKHVLNLYLL GVVLTLLSIF VRVMESLEGL LESPSPGTSW TTRSQLANTE PTKGLPDHPS 60
RSM

Seq ID NO: 564 DNA sequence

Nucleic Acid Accession #: NM_023915.1

Coding sequence: 250..1326

1 11 21 31 41 51

I I I I I I

GGCACGAGGG TTTCGTTTTC ATGCTTTACC AGAAAATCCA CTTCCCTGCC GACCTTAGTT 60

TCAAAGCTTA TTCTTAATTA GAGACAAGAA ACCTGTTTCA ACTTGAAGAC ACCGTATGAG 120

GTGAATGGAC AGCCAGCCAC CACAATGAAA GAAATCAAAC CAGGAATAAC CTATGCTGAA 180

CCCACGCCTC AATCGTCCCC AAGTGTTTCC TGACACGCAT CTTTGCTTAC AGTGCATCAC 240

AACTGAAGAA TGGGGTTCAA CTTGACGCTT GCAAAATTAC CAAATAACGA GCTGCACGGC 300

CAAGAGAGTC ACAATTCAGG CAACAGGAGC GACGGGCCAG GAAAGAACAC CACCCTTCAC 360

AATGAATTTG ACACAATTGT CTTGCCGGTG CTTTATCTCA TTATATTTGT GGCAAGCATC 420

TTGCTGAATG GTTTAGCAGT GTGGATCTTC TTCCACATTA GGAATAAAAC CAGCTTCATA 480

TTCTATCTCA AAAACATAGT GGTTGCAGAC CTCATAATGA CGCTGACATT TCCATTTCGA 540

ATAGTCCATG ATGCAGGATT TGGACCTTGG TACTTCAAGT TTATTCTCTG CAGATACACT 600

TCAGTTTTGT TTTATGCAAA CATGTATACT TCCATCGTGT TCCTTGGGCT GATAAGCATT 660

GATCGCTATC TGAAGGTGGT CAAGCCATTT GGGGACTCTC GGATGTACAG CATAACCTTC 720

ACGAAGGTTT TATCTGTTTG TGTTTGGGTG ATCATGGCTG TTTTGTCTTT GCCAAACATC 780

ATCCTGACAA ATGGTCAGCC AACAGAGGAC AATATCCATG ACTGCTCAAA ACTTAAAAGT 840

CCTTTGGGGG TCAAATGGCA TACGGCAGTC ACCTATGTGA ACAGCTGCTT GTTTGTGGCC 900

GTGCTGGTGA TTCTGATCGG ATGTTACATA GCCATATCCA GGTACATCCA CAAATCCAGC 960

AGGCAATTCA TAAGTCAGTC AAGCCGAAAG CGAAAACATA ACCAGAGCAT CAGGGTTGTT 1020

GTGGCTGTGT TTTTTACCTG CTTTCTACCA TATCACTTGT GCAGAATTCC TTTTACTTTT 1080

AGTCACTTAG ACAGGCTTTT AGATGAATCT GCACAAAAAA TCCTATATTA CTGCAAAGAA 1140

ATTACACTTT TCTTGTCTGC GTGTAATGTT TGCCTGGATC CAATAATTTA CTTTTTCATG 1200

TGTAGGTCAT TTTCAAGAAG GCTGTTCAAA AAATCAAATA TCAGAACCAG GAGTGAAAGC 1260

ATCAGATCAC TGCAAAGTGT GAGAAGATCG GAAGTTCGCA TATATTATGA TTACACTGAT 1320

GTGTAGGCCT TTTATTGTTT GTTGGAATCG ATATGTACAA AGTGTAAATA AATGTTTCTT 1380
TTCATTATCC TTAAAAAAAA AA

Seq ID NO: 565 Protein sequence
Protein Accession #: NP_076404

1 11 21 31 41 51

I I I I I I

MGFNLTLAKL PNNELHGQES HNSGNRSDGP GKNTTLHNEF DTIVLPVLYL IIFVASILLN 60

GLAVWIFFHI RNKTSFIFYL KNIVVADLIM TLTFPFRIVH DAGFGPWYFK FILCRYTSVL 120

FYANMYTSIV FLGLISIDRY LKVVKPFGDS RMYSITFTKV LSVCVWVIMA VLSLPNIILT 180

NGQPTEDNIH DCSKLKSPLG VKWHTAVTYV NSCLFVAVLV ILIGCYIAIS RYIHKSSRQF 240

ISQSSRKRKH NQSIRVVVAV FFTCFLPYHL CRIPFTFSHL DRLLDESAQK ILYYCKEITL 300
FLSACNVCLD PIIYFFMCRS FSRRLFKKSN IRTRSESIRS LQSVRRSEVR IYYDYTDV

Seq ID NO: 566 DNA sequence

Nucleic Acid Accession #: NM_005365.1

Coding sequence: 1..948

1 11 21 31 41 51

I I I I I I

ATGTCTCTCG AGCAGAGGAG TCCGCACTGC AAGCCTGATG AAGACCTTGA AGCCCAAGGA 60

GAGGACTTGG GCCTGATGGG TGCACAGGAA CCCACAGGCG AGGAGGAGGA GACTACCTCC 120

TCCTCTGACA GCAAGGAGGA GGAGGTGTCT GCTGCTGGGT CATCAAGTCC TCCCCAGAGT 180

CCTCAGGGAG GCGCTTCCTC CTCCATTTCC GTCTACTACA CTTTATGGAG CCAATTCGAT 240

GAGGGCTCCA GCAGTCAAGA AGAGGAAGAG CCAAGCTCCT CGGTCGACCC AGCTCAGCTG 300

GAGTTCATGT TCCAAGAAGC ACTGAAATTG AAGGTGGCTG AGTTGGTTCA TTTCCTGCTC 360
CACAAATATC GAGTCAAGGA GCCGGTCACA AAGGCAGAAA TGCTGGAGAG CGTCATCAAA 420

AATTACAAGC GCTACTTTCC TGTGATCTTC CGCAAAGCCT CCGAGTTCAT GCAGGTGATC 480

TTTGGCACTG ATGTGAAGGA GGTGGACCCC GCCGGCCACT CCTACATCCT TGTCACTGCT 540

CTTGGCCTCT CGTGCGATAG CATGCTGGGT GATGGTCATA GCATGCCCAA GGCCGCCCTC 600

CTGATCATTG TCCTGGGTGT GATCCTAACC AAAGACAACT GCGCCCCTGA AGAGGTTATC 660

TGGGAAGCGT TGAGTGTGAT GGGGGTGTAT GTTGGGAAGG AGCACATGTT CTACGGGGAG 720

CCCAGGAAGC TGCTCACCCA AGATTGGGTG CAGGAAAACT ACCTGGAGTA CCGGCAGGTG 780

CCCGGCAGTG ATCCTGCGCA CTACGAGTTC CTGTGGGGTT CCAAGGCCCA CGCTGAAACC 840

AGCTATGAGA AGGTCATAAA TTATTTGGTC ATGCTCAATG CAAGAGAGCC CATCTGCTAC 900
CCATCCCTTT ATGAAGAGGT TTTGGGAGAG GAGCAAGAGG GAGTCTGA

Seq ID NO: 567 Protein sequence
Protein Accession #: NP_005356.1

1 11 21 31 41 51

I I I I I I

MSLEQRSPHC KPDEDLEAQG EDLGLMGAQE PTGEEEETTS SSDSKEEEVS AAGSSSPPQS 60

PQGGASSSIS VYYTLWSQFD EGSSSQEEEE PSSSVDPAQL EFMFQEALKL KVAELVHFLL 120

HKYRVKEPVT KAEMLESVIK NYKRYFPVIF GKASEFMQVI FGTDVKEVDP AGHSYILVTA 180

LGLSCDSMLG DGHSMPKAAL LIIVLGVILT KDNCAPEEVI WEALSVMGVY VGKEHMFYGE 240

PRKLLTQDWV QENYLEYRQV PGSDPAHYEF LWGSKAHAET SYEKVINYLV MLNAREPICY 300
PSLYEEVLGE EQEGV

Seq ID NO: 568 DNA sequence
Nucleic Acid Accession #: NM_014400
Coding sequence: 86..1126

1 11 21 31 41 51

I I I I I I

GGTTACTCAT CCTGGGCTCA GGTAAGAGGG CCCGAGCTCG GAGGCGGCAC ACCCAGGGGG 60

GACGCCAAGG GAGCAGGACG GAGCCATGGA CCCCGCCAGG AAAGCAGGTG CCCAGGCCAT 120

GATCTGGACT GCAGGCTGGC TGCTGCTGCT GCTGCTTCGC GGAGGAGCGC AGGCCCTGGA 180

GTGCTACAGC TGCGTGCAGA AAGCAGATGA CGGATGCTCC CCGAACAAGA TGAAGACAGT 240

GAAGTGCGCG CCGGGCGTGG ACGTCTGCAC CGAGGCCGTG GGGGCGGTGG AGACCATCCA 300

CGGACAATTC TCGCTGGCAG TGCSGGGTTG CGGTTCGGGA CTCCCCGGCA AGAATGACCG 360

CGGCCTGGAT CTTCACGGGC TTCTGGCGTT CATCCAGCTG CAGCAATGCG CTCAGGATCG 420

CTGCAACGCC AAGCTCAACC TCACCTCGCG GGCGCTCGAC CCGGCAGGTA ATGAGAGTGC 480

ATACCCGCCC AACGGCGTGG AGTGCTACAG CTGTGTGGGC CTGAGCCGGG AGGCGTGCCA 540

GGGTACATCG CCGCCGGTCG TGAGCTGCTA CAACGCCAGC GATCATGTCT ACAAGGGCTG 600

CTTCGACGGC AACGTCACCT TGACGGCAGC TAATGTGACT GTGTCCTTGC CTGTCCGGGG 660

CTGTGTCCAG GATGAATTCT GCACTCGGGA TGGAGTAACA GGCCCAGGGT TCACGCTCAG 720

TGGCTCCTGT TGCCAGGGGT CCCGCTGTAA CTCTGACCTC CGCAACAAGA CCTACTTCTC 780

CCCTCGAATC CCACCCCTTG TCCGGCTGCC CCCTCCAGAG CCCACGACTG TGGCCTCAAC 840

CACATCTGTC ACCACTTCTA CCTCGGCCCC AGTGAGACCC ACATCCACCA CCAAACCCAT 900

GCCAGCGCCA ACCAGTCAGA CTCCGAGACA GGGAGTAGAA CACGAGGCCT CCCGGGATGA 960

GGAGCCCAGG TTGACTGGAG GCGCCGCTGG CCACCAGGAC CGCAGCAATT CAGGGCAGTA 1020

TCCTGCAAAA GGGGGGCCCC AGCAGCCCCA TAATAAAGGC TGTGTGGCTC CCACAGCTGG 1080

ATTGGCAGCC CTTCTGTTGG CCGTGGCTGC TGGTGTCCTA CTGTGAGCTT CTCCACCTGG 1140

AAATTTCCCT CTCACCTACT TCTCTGGCCC TGGGTACCCC TCTTCTCATC ACTTCCTGTT 1200

CCCACCACTG GACTGGGCTG GCCCAGCCCC TGTTTTTCCA ACATTCCCCA GTATCCCCAG 1260

CTTCTGCTGC GCTGGTTTGC GGCTTTGGGA AATAAAATAC CGTTGTATAT ATTCTGGCAG 1320

GGGTGTTCTA GCTTTTTGAG GACAGCTCCT GTATCCTTCT CATCCTTGTC TCTCCGCTTG 1380

TCCTCTTGTG ATGTTAGGAC AGAGTGAGAG AAGTCAGCTG TCACGGGGAA GGTGAGAGAG 1440

AGGATGCTAA GCTTCCTACT CACTTTCTCC TAGCCAGCCT GGACTTTGGA GCGTGGGGTG 1500

GGTGGGACAA TGGCTCCCCA CTCTAAGCAC TGCCTCCCCT ACTCCCCGCA TCTTTGGGGA 1560

ATCGGTTCCC CATATGTCTT CCTTACTAGA CTGTGAGCTC CTCGAGGGCA GGGACCGTGC 1620

CTTATGTCTG TGTGTGATCA GTTTCTGGCA CATAAATGCC TCAATAAAGA TTTAATTACT 1680
TTGTATAGTG AAAAAAAA

Seq ID NO: 569 Protein sequence
Protein Accession #: NP_055215

1 11 21 31 41 51

I I I I I I

MDPARKAGAQ AMIWTAGWLL LLLLRGGAQA LECYSCVQKA DDGCSPNKMK TVKCAPGVDV 60

CTEAVGAVET IHGQFSLAVX GCGSGLPGKN DRGLDLHGLL AFIQLQQCAQ DRCNAKLNLT 120

SRALDPAGNE SAYPPNGVEC YSCVGLSREA CQGTSPPVVS CYNASDHVYK GCFDGNVTLT 180

AANVTVSLPV RGCVQDEFCT RDGVTGPGFT LSGSCCQGSR CNSDLRNKTY FSPRIPPLVR 240

LPPPEPTTVA STTSVTTSTS APVRPTSTTK PMPAPTSQTP RQGVEHEASR DEEPRLTGGA 300
AGHQDRSNSG QYPAKGGPQQ PHNKGCVAPT AGLAALLLAV AAGVLL

Seq ID NO: 570 DNA sequence
Nucleic Acid Accession #: NM_005329.1
Coding sequence: 1..1662

1 11 21 31 41 51

I I I I I I

ATGCCGGTGC AGCTGACGAC AGCCCTGCGT GTGGTGGGCA CCAGCCTGTT TGCCCTGGCA 60

GTGCTGGGTG GCATCCTGGC AGCCTATGTG ACGGGCTACC AGTTCATCCA CACGGAAAAG 120

CACTACCTGT CCTTCGGCCT GTACGGCGCC ATCCTGGGCC TGCACCTGCT CATTCAGAGC 180

CTTTTTGCCT TCCTGGAGCA CCGGCGCATG CGACGTGCCG GCCAGGCCCT GAAGCTGCCC 240

TCCCCGCGGC GGGGCTCGGT GGCACTGTGC ATTGCCGCGT ACCAGGAGGA CCCTGACTAC 300

TTGCGCAAGT GCCTGCGCTC GGCCCAGCGC ATCTCCTTCC CTGACCTCAA GGTGGTCATG 360

GTGGTGGATG GCAACCGCCA GGAGGACGCC TACATGCTGG ACATCTTCCA CGAGGTGCTG 420

GGCGGCACCG AGCAGGCCGG CTTCTTTGTG TGGCGCAGCA ACTTCCATGA GGCAGGCGAG 480

GGTGAGACGG AGGCCAGCCT GCAGGAGGGC ATGGACCGTG TGCGGGATGT GGTGCGGGCC 540

AGCACCTTCT CGTGCATCAT GCAGAAGTGG GGAGGCAAGC GCGAGGTCAT GTACACGGCC 600

TTCAAGGCCC TCGGCGATTC GGTGGACTAC ATCCAGGTGT GCGACTCTGA CACTGTGCTG 660 

GATCCAGCCT GCACCATCGA GATGCTTCGA GTCCTGGAGG AGGATCCCCA AGTAGGGGGA 720 

GTCGGGGGAG ATGTCCAGAT CCTCAACAAG TACGACTCAT GGATTTCCTT CCTGAGCAGC 780 

GTGCGGTACT GGATGGCCTT CAACGTGGAG CGGGCCTGCC AGTCCTACTT TGGCTGTGTG 840

CAGTGTATTA GTGGGCCCTT GGGCATGTAC CGCAACAGCC TCCTCCAGCA GTTCCTGGAG 900

GACTGGTACC ATCAGAAGTT CCTAGGCAGC AAGTGCAGCT TCGGGGATGA CCGGCACCTC 960

ACCAACCGAG TCCTGAGCCT TGGCTACCGA ACTAAGTATA CCGCGCGCTC CAAGTGCCTC 1020

ACAGAGACCC CCACTAAGTA CCTCCGGTGG CTCAACCAGC AAACCCGCTG GAGCAAGTCT 1080

TACTTCCGGG AGTGGCTCTA CAACTCTCTG TGGTTCCATA AGCACCACCT CTGGATGACC 1140

TACGAGTCAG TGGTCACGGG TTTCTTCCCC TTCTTCCTCA TTGCCACGGT TATACAGCTT 1200

TTCTACCGGG GCCGCATCTG GAACATTCTC CTCTTCCTGC TGACGGTGCA GCTGGTGGGC 1260 

ATTATCAAGG CCACCTACGC CTGCTTCCTT CGGGGCAATG CAGAGATGAT CTTCATGTCC 1320 

CTCTACTCCC TCCTCTATAT GTCCAGCCTT CTGCCGGCCA AGATCTTTGC CATTGCTACC 1380 

ATCAACAAAT CTGGCTGGGG CACCTCTGGC CGAAAAACCA TTGTGGTGAA CTTCATTGGC 1440 

CTCATTCCTG TGTCCATCTG GGTGGCAGTT CTCCTGGGAG GGCTGGCCTA CACAGCTTAT 1500

TGCCAGGACC TGTTCAGTGA GACAGAGCTA GCCTTCCTTG TCTCTGGGGC TATACTGTAT 1560 

GGCTGCTACT GGGTGGCCCT CCTCATGCTA TATCTGGCCA TCATCGCCCG GCGATGTGGG 1620 
AAGAAGCCGG AGCAGTACAG CTTGGCTTTT GCTGAGGTGT GA 

Seq ID NO: 571 Protein sequence 
Protein Accession #: NP_005320.1

1 11 21 31 41 51

I I I I I I

MPVQLTTALR VVGTSLFALA VLGGILAAYV TGYQFIHTEK HYLSFGLYGA ILGLHLLIQS 60

LFAFLEHRRM RRAGQALKLP SPRRGSVALC IAAYQEDPDY LRKCLRSAQR ISFPDLKVVM 120

VVDGNRQEDA YMLDIFHEVL GGTEQAGFFV WRSNFHEAGE GETEASLQEG MDRVRDVVRA 180

STFSCIMQKW GGKREVMYTA FKALGDSVDY IQVCDSDTVL DPACTIEMLR VLEEDPQVGG 240

VGGDVQILNK YDSWISFLSS VRYWMAFNVE RACQSYFGCV QCISGPLGMY RNSLLQQFLE 300

DWYHQKFLGS KCSFGDDRHL TNRVLSLGYR TKYTARSKCL TETPTKYLRW LNQQTRWSKS 360

YFREWLYNSL WFHKHHLWMT YESVVTGFFP FFLIATVIQL FYRGRIWNIL LFLLTVQLVG 420

IIKATYACFL RGNAEMIFMS LYSLLYMSSL LPAKIFAIAT INKSGWGTSG RKTIVVNFIG 480

LIPVSIWVAV LLGGLAYTAY CQDLFSETEL AFLVSGAILY GCYWVALLML YLAIIARRCG 540
KKPEQYSLAF AEV 

Seq ID NO: 572 DNA sequence

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 148-7095 

1 11 21 31 41 51 

I I I I I I 

CACACATACG CACGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA 120 

CGGCGAGGGG CCGCAGACCG TCTGGAAATG CGAATCCTAA AGCGTTTCCT CGCTTGCATT 180 

CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGGAAAG 300 

AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 

CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACCGT 480 

GTCAGCGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540 

AAATGCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 

GAGATGCAAA TCTACTGCTT TGATGCGGAC CGATTTTCAA GTTTTGAGGA AGCAGTCAAA 660 

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTG 720 

GATTTCAAAG CGATTATTGA TGGAGTCGAA AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 780 

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAGAT 900 

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960 

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080 

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140 

TGGGAAAGAC CTCGAGTCGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGTACCAG 1200 

CAGTTGGATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTCGACAT GCCTACTGAT 1380 

AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAAGAGGGAA AAGACATTGA AGAAGGCGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TCGCATAGGG 1560 

ACGAAATACA ATGAAGCCAA GACTAACCGA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 

GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

AACTTGTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

GAAAACCCAG AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100 

GTGTGGTTTC CTAGCTCTAC AGACATAACA GCACAGCCCG ATGTTGGATC AGGCAGAGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGAGATA CGTGTTGATG AATCTGAGAA GACAACCAAG 2220 

TCCTTTTCTG CAGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 

TCCAGACAAC AGGATTTGGT CTCCACGGTC AACGTGGTAT ACTCGCAGAC AACCCAACCG 2400 

GTATACAATG GTGAGACACC TCTTCAACCT TCCTACAGTA GTGAAGTCTT TCCTCTAGTC 2460 

ACCCCTTTGT TGCTTGACAA TCAGATCCTC AACACTACCC CTGCTGCTTC AAGTAGTGAT 2520 

TCGGCCTTGC ATGCTACGCC TGTATTTCCC AGTGTCGATG TGTCATTTGA ATCCATCCTG 2580 

TCTTCCTATG ATGGTGCACC TTTGCTTCCA TTTTCCTCTG CTTCCTTCAG TAGTGAATTG 2640 

TTTCGCCATC TGCATACAGT TTCTCAAATC CTTCCACAAG TTACTTCAGC TACCGAGAGT 2700 

GATAAGGTGC CCTTGCATGC TTCTCTGCCA GTGGCTGGGG GTGATTTGCT ATTAGAGCCC 2760 
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AGCCTTGCTC AGTATTCTGA TGTGCTGTCC ACTACTCATG CTGCTTCAGA GACGCTGGAA 2820 

TTTGGTAGTG AATCTGGTGT TCTTTATAAA ACGCTTATGT TTTCTCAAGT TGAACCACCC 2880 

AGCAGTGATG CCATGATGCA TGCACGTTCT TCAGGGCCTG AACCTTCTTA TGCCTTGTCT 2940

GATAATGAGG GCTCCCAACA CATCTTCACT GTTTCTTACA GTTCTGCAAT ACCTGTGCAT 3000 

GATTCTGTGG GTGTAACTTA TCAGGGTTCC TTATTTAGCG GCCCTAGCCA TATACCAATA 3060 

CCTAAGTCTT CGTTAATAAC CCCAACTGCA TCATTACTGC AGCCTACTCA TGCCCTCTCT 3120

GGTGATGGGG AATGGTCTGG AGCCTCTTCT GATAGTGAAT TTCTTTTACC TGACACAGAT 3180 

GGGCTGACAG CCCTTAACAT TTCTTCACCT GTTTCTGTAG CTGAATTTAC ATATACAACA 3240 

TCTGTGTTTG GTGATGATAA TAAGGCGCTT TCTAAAAGTG AAATAATATA TGGAAATGAG 3300 

ACTGAACTGC AAATTCCTTC TTTCAATGAG ATGGTTTACC CTTCTGAAAG CACAGTCATG 3360

CCCAACATGT ATGATAATGT AAATAAGTTG AATGCGTCTT TACAAGAAAC CTCTGTTTCC 3420 

ATTTCTAGCA CCAAGGGCAT GTTTCCAGGG TCCCTTGCTC ATACCACCAC TAAGGTTTTT 3480 

GATCATGAGA TTAGTCAAGT TCCAGAAAAT AACTTTTCAG TTCAACCTAC ACATACTGTC 3540 

TCTCAAGCAT CTGGTGACAC TTCGCTTAAA CCTGTGCTTA GTGCAAACTC AGAGCCAGCA 3600 

TCCTCTGACC CTGCTTCTAG TGAAATGTTA TCTCCTTCAA CTCAGCTCTT ATTTTATGAG 3660 

ACCTCAGCTT CTTTTAGTAC TGAAGTATTG CTACAACCTT CCTTTCAGGC TTCTGATGTT 3720 

GACACCTTGC TTAAAACTGT TCTTCCAGCT GTGCCCAGTG ATCCAATATT GGTTGAAACC 3780 

CCCAAAGTTG ATAAAATTAG TTCTACAATG TTGCATCTCA TTGTATCAAA TTCTGCTTCA 3840 

AGTGAAAACA TGCTGCACTC TACATCTGTA CCAGTTTTTG ATGTGTCGCC TACTTCTCAT 3900 

ATGCACTCTG CTTCACTTCA AGGTTTGACC ATTTCCTATG CAAGTGAGAA ATATGAACCA 3960 

GTTTTGTTAA AAAGTGAAAG TTCCCACCAA GTGGTACCTT CTTTGTACAG TAATGATGAG 4020

TTGTTCCAAA CGGCCAATTT GGAGATTAAC CAGGCCCATC CCCCAAAAGG AAGGCATGTA 4080 

TTTGCTACAC CTGTTTTATC AATTGATGAA CCATTAAATA CACTAATAAA TAAGCTTATA 4140 

CATTCCGATG AAATTTTAAC CTCCACCAAA AGTTCTGTTA CTGGTAAGGT ATTTGCTGGT 4200 

ATTCCAACAG TTGCTTCTGA TACATTTGTA TCTACTGATC ATTCTGTTCC TATAGGAAAT 4260 

GGGCATGTTG CCATTACAGC TGTTTCTCCC CACAGAGATG GTTCTGTAAC CTCAACAAAG 4320 

TTGCTGTTTC CTTCTAAGGC AACTTCTGAG CTGAGTCATA GTGCCAAATC TGATGCCGGT 4380 

TTAGTGGGTG GTGGTGAAGA TGGTGACACT GATGATGATG GTGATGATGA TGATGATGAC 4440 

AGAGGTAGTG ATGGCTTATC CATTCATAAG TGTATGTCAT GCTCATCCTA TAGAGAATCA 4500 

CAGGAAAAGG TAATGAATGA TTCAGACACC CACGAAAACA GTCTTATGGA TCAGAATAAT 4560 

CCAATCTCAT ACTCACTATC TGAGAATTCT GAAGAAGATA ATAGAGTCAC AAGTGTATCC 4620 

TCAGACAGTC AAACTGGTAT GGACAGAAGT CCTGGTAAAT CACCATCAGC AAATGGGCTA 4680 

TCCCAAAAGC ACAATGATGG AAAAGAGGAA AATGACATTC AGACTGGTAG TGCTCTGCTT 4740 

CCTCTCAGCC CTGAATCTAA AGCATGGGCA GTTCTGACAA GTGATGAAGA AAGTGGATCA 4800 

GGGCAAGGTA CCTCAGATAG CCTTAATGAG AATGAGACTT CCACAGATTT CAGTTTTGCA 4860 

GACACTAATG AAAAAGATGC TGATGGGATC CTGGCAGCAG GTGACTCAGA AATAACTCCT 4920 

GGATTCCCAC AGTCCCCAAC ATCATCTGTT ACTAGCGAGA ACTCAGAAGT GTTCCACGTT 4980 

TCAGAGGCAG AGGCCAGTAA TAGTAGCCAT GAGTCTCGTA TTGGTCTAGC TGAGGGGTTG 5040 

GAATCCGAGA AGAAGGCAGT TATACCCCTT GTGATCGTGT CAGCCCTGAC TTTTATCTGT 5100 

CTAGTGGTTC TTGTGGGTAT TCTCATCTAC TGGAGGAAAT GCTTCCAGAC TGCACACTTT 5160 

TACTTAGAGG ACAGTACATC CCCTAGAGTT ATATCCACAC CTCCAACACC TATCTTTCCA 5220 

ATTTCAGATG ATGTCGGAGC AATTCCAATA AAGCACTTTC CAAAGCATGT TGCAGATTTA 5280 

CATGCAAGTA GTGGGTTTAC TGAAGAATTT GAGACACTGA AAGAGTTTTA CCAGGAAGTG 5340 

CAGAGCTGTA CTGTTGACTT AGGTATTACA GCAGACAGCT CCAACCACCC AGACAACAAG 5400 

CACAAGAATC GATACATAAA TATCGTTGCC TATGATCATA GCAGGGTTAA GCTAGCACAG 5460 

CTTGCTGAAA AGGATGGCAA ACTGACTGAT TATATCAATG CCAATTATGT TGATGGCTAC 5520 

AACAGACCAA AAGCTTATAT TGCTGCCCAA GGCCCACTGA AATCCACAGC TGAAGATTTC 5580 

TGGAGAATGA TATGGGAACA TAATGTGGAA GTTATTGTCA TGATAACAAA CCTCGTGGAG 5640 

AAAGGAAGGA GAAAATGTGA TCAGTACTGG CCTGCCGATG GGAGTGAGGA GTACGGGAAC 5700 

TTTCTGGTCA CTCAGAAGAG TGTGCAAGTG CTTGCCTATT ATACTGTGAG GAATTTTACT 5760 

CTAAGAAACA CAAAAATAAA AAAGGGCTCC CAGAAAGGAA GACCCAGTGG ACGTGTGGTC 5820 

ACACAGTATC ACTACACGCA GTGGCCTGAC ATGGGAGTAC CAGAGTACTC CCTGCCAGTG 5880 

CTGACCTTTG TGAGAAAGGC AGCCTATGCC AAGCGCCATG CAGTGGGGCC TGTTGTCGTC 5940 

CACTGCAGTG CTGGAGTTGG AAGAACAGGC ACATATATTG TGCTAGACAG TATGTTGCAG 6000 

CAGATTCAAC ACGAAGGAAC TGTCAACATA TTTGGCTTCT TAAAACACAT CCGTTCACAA 6060 

AGAAATTATT TGGTACAAAC TGAGGAGCAA TATGTCTTCA TTCATGATAC ACTGGTTGAG 6120 

GCCATACTTA GTAAAGAAAC TGAGGTGCTG GACAGTCATA TTCATGCCTA TGTTAATGCA 6180 

CTCCTCATTC CTGGACCAGC AGGCAAAACA AAGCTAGAGA AACAATTCCA GCTCCTGAGC 6240 

CAGTCAAATA TACAGCAGAG TGACTATTCT GCAGCCCTAA AGCAATGCAA CAGGGAAAAG 6300 

AATCGAACTT CTTCTATCAT CCCTGTGGAA AGATCAAGGG TTGGCATTTC ATCCCTGAGT 6360 

GGAGAAGGCA CAGACTACAT CAATGCCTCC TATATCATGG GCTATTACCA GAGCAATGAA 6420 

TTCATCATTA CCCAGCACCC TCTCCTTCAT ACCATCAAGG ATTTCTGGAG GATGATATGG 6480 

GACCATAATG CCCAACTGGT GGTTATGATT CCTGATGGCC AAAACATGGC AGAAGATGAA 6540 

TTTGTTTACT GGCCAAATAA AGATGAGCCT ATAAATTGTG AGAGCTTTAA GGTCACTCTT 6600 

ATGGCTGAAG AACACAAATG TCTATCTAAT GAGGAAAAAC TTATAATTCA GGACTTTATC 6660 

TTAGAAGCTA CACAGGATGA TTATGTACTT GAAGTGAGGC ACTTTCAGTG TCCTAAATGG 6720 

CCAAATCCAG ATAGCCCCAT TAGTAAAACT TTTGAACTTA TAAGTGTTAT AAAAGAAGAA 6780 

GCTGCCAATA GGGATGGGCC TATGATTGTT CATGATGAGC ATGGAGGAGT GACGGCAGGA 6840 

ACTTTCTGTG CTCTGACAAC CCTTATGCAC CAACTAGAAA AAGAAAATTC CGTGGATGTT 6900 

TACCAGGTAG CCAAGATGAT CAATCTGATG AGGCCAGGAG TCTTTGCTGA CATTGAGCAG 6960 

TATCAGTTTC TCTACAAAGT GATCCTCAGC CTTGTGAGCA CAAGGCAGGA AGAGAATCCA 7020 

TCCACCTCTC TGGACAGTAA TGGTGCAGCA TTGCCTGATG GAAATATAGC TGAGAGCTTA 7080 

GAGTCTTTAG TTTAACACAG AAAGGGGTGG GGGGACTCAC ATCTGAGCAT TGTTTTCCTC 7140 

TTCCTAAAAT TAGGCAGGAA AATCAGTCTA GTTCTGTTAT CTGTTGATTT CCCATCACCT 7200 

GACAGTAACT TTCATGACAT AGGATTCTGC CGCCAAATTT ATATCATTAA CAATGTGTGC 7260 

CTTTTTGCAA GACTTGTAAT TTACTTATTA TGTTTGAACT AAAATGATTG AATTTTACAG 7320 

TATTTCTAAG AATGGAATTG TGGTATTTTT TTCTGTATTG ATTTTAACAG AAAATTTCAA 7380 

TTTATAGAGG TTAGGAATTC CAAACTACAG AAAATGTTTG TTTTTAGTGT CAAATTTTTA 7440 

GCTGTATTTG TAGCAATTAT CAGGTTTGCT AGAAATATAA CTTTTAATAC AGTAGCCTGT 7500 

AAATAAAACA CTCTTCCATA TGATATTCAA CATTTTACAA CTGCAGTATT CACCTAAAGT 7560 

AGAAATAATC TGTTACTTAT TGTAAATACT GCCCTAGTGT CTCCATGGAC CAAATTTATA 7620 

TTTATAATTG TAGATTTTTA TATTTTACTA CTGAGTCAAG TTTTCTAGTT CTGTGTAATT 7680 

GTTTAGTTTA ATGACGTAGT TCATTAGCTG GTCTTACTCT ACCAGTTTTC TGACATTGTA 7740 

TTGTGTTACC TAAGTCATTA ACTTTGTTTC AGCATGTAAT TTTAACTTTT GTGGAAAATA 7800 

GAAATACCTT CATTTTGAAA GAAGTTTTTA TGAGAATAAC ACCTTACCAA ACATTGTTCA 7860 

AATGGTTTTT ATCCAAGGAA TTGCAAAAAT AAATATAAAT ATTGCCATTA AAAAAAAAAA 7920
AAAAAAAAAA AAAAAAAAAA AAAA

Seq ID NO: 573 Protein sequence
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

I I I I I I 

MRILKRFLAC IQLLCVCRLD WANGYYRQQR KLVEEIGWSY TGALNQKNWG KKYPTCNSPK 60 

QSPINIDEDL TQVNVNLKKL KFQGWDKTSL ENTFIHNTGK TVEINLTNDY RVSGGVSEMV 120 

FKASKITFHW GKCNMSSDGS EHSLEGQKFP LEMQIYCFDA DRFSSFEEAV KGKGKLRALS 180 

ILFEVGTEEN LDFKAIIDGV ESVSRFGKQA ALDPFILLNL LPNSTDKYYI YNGSLTSPPC 240

TDTVDWIVFK DTVSISESQL AVFCEVLTMQ QSGYVMLMDY LQNNFREQQY KFSRQVFSSY 300

TGKEEIHEAV CSSEPENVQA DPENYTSLLV TWERPRVVYD TMIEKFAVLY QQLDGEDQTK 360

HEFLTDGYQD LGAILNNLLP NMSYVLQIVA ICTNGLYGKY SDQLIVDMPT DNPELDLFPE 420 

LIGTEEIIKE EEEGKDIEEG AIVNPGRDSA TNQIRKKEPQ ISTTTHYNRI GTKYNEAKTN 480 

RSPTRGSEFS GKGDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTELPPHT VEGTSASLND 540 

GSKTVLRSPH MNLSGTAESL NTVSITEYEE ESLLTSFKLD TGAEDSSGSS PATSAIPFIS 600 

ENISQGYIFS SENPETITYD VLIPESARNA SEDSTSSGSE ESLKDPSMEG NVWFPSSTDI 660 

TAQPDVGSGR ESFLQTNYTE IRVDESEKTT KSFSAGPVMS QGPSVTDLEM PHYSTFAYFP 720 

TEVTPHAFTP SSRQQDLVST VNVVYSQTTQ PVYNGETPLQ PSYSSEVFPL VTPLLLDNQI 780

LNTTPAASSS DSALHATPVF PSVDVSFESI LSSYDGAPLL PFSSASFSSE LFRHLHTVSQ 840 

ILPQVTSATE SDKVPLHASL PVAGGDLLLE PSLAQYSDVL STTHAASETL EFGSESGVLY 900 

KTLMFSQVEP PSSDAMMHAR SSGPEPSYAL SDNEGSQHIF TVSYSSAIPV HDSVGVTYQG 960 

SLFSGPSHIP IPKSSLITPT ASLLQPTHAL SGDGEWSGAS SDSEFLLPDT DGLTALNISS 1020

PVSVAEFTYT TSVFGDDNKA LSKSEIIYGN ETELQIPSFN EMVYPSESTV MPNMYDNVNK 1080

LNASLQETSV SISSTKGMFP GSLAHTTTKV FDHEISQVPE NNFSVQPTHT VSQASGDTSL 1140

KPVLSANSEP ASSDPASSEM LSPSTQLLFY ETSASFSTEV LLQPSFQASD VDTLLKTVLP 1200 

AVPSDPILVE TPKVDKISST MLHLIVSNSA SSENMLHSTS VPVFDVSPTS HMHSASLQGL 1260 

TISYASEKYE PVLLKSESSH QVVPSLYSND ELFQTANLEI NQAHPPKGRH VFATPVLSID 1320

EPLNTLINKL IHSDEILTST KSSVTGKVFA GIPTVASDTF VSTDHSVPIG NGHVAITAVS 1380

PHRDGSVTST KLLFPSKATS ELSHSAKSDA GLVGGGEDGD TDDDGDDDDD DRGSDGLSIH 1440

KCMSCSSYRE SQEKVMNDSD THENSLMDQN NPISYSLSEN SEEDNRVTSV SSDSQTGMDR 1500 

SPGKSPSANG LSQKHNDGKE ENDIQTGSAL LPLSPESKAW AVLTSDEESG SGQGTSDSLN 1560 

ENETSTDFSF ADTNEKDADG ILAAGDSEIT PGFPQSPTSS VTSENSEVFH VSEAEASNSS 1620 

HESRIGLAEG LESEKKAVIP LVIVSALTFI CLVVLVGILI YWRKCFQTAH FYLEDSTSPR 1680

VISTPPTPIF PISDDVGAIP IKHFPKHVAD LHASSGFTEE FETLKEFYQE VQSCTVDLGI 1740 

TADSSNHPDN KHKNRYINIV AYDHSRVKLA QLAEKDGKLT DYINANYVDG YNRPKAYIAA 1800 

QGPLKSTAED FWRMIWEHNV EVIVMITNLV EKGRRKCDQY WPADGSEEYG NFLVTQKSVQ 1860 

VLAYYTVRNF TLRNTKIKKG SQKGRPSGRV VTQYHYTQWP DMGVPEYSLP VLTFVRKAAY 1920

AKRHAVGPVV VHCSAGVGRT GTYIVLDSML QQIQHEGTVN IFGFLKHIRS QRNYLVQTEE 1980

QYVFIHDTLV EAILSKETEV LDSHIHAYVN ALLIPGPAGK TKLEKQFQLL SQSNIQQSDY 2040 

SAALKQCNRE KNRTSSIIPV ERSRVGISSL SGEGTDYINA SYIMGYYQSN EFIITQHPLL 2100

HTIKDFWRMI WDHNAQLVVM IPDGQNMAED EFVYWPNKDE PINCESFKVT LMAEEHKCLS 2160

NEEKLIIQDF ILEATQDDYV LEVRHFQCPK WPNPDSPISK TFELISVIKE EAANRDGPMI 2220 

VHDEHGGVTA GTFCALTTLM HQLEKENSVD VYQVAKMINL MRPGVFADIE QYQFLYKVIL 2280 
SLVSTRQEEN PSTSLDSNGA ALPDGNIAES LESLV

Seq ID NO: 574 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 148-4518 

1 11 21 31 41 51 

I I I I I I 

CACACATACG CACGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA 120 

CGGCGAGGGG CCGCAGACCG TCTGGAAATG CGAATCCTAA AGCGTTTCCT CGCTTGCATT 180 

CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGGAAAG 300 

AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 

CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACCGT 480 

GTCAGGGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540 

AAATGCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 

GAGATGCAAA TCTACTGCTT TGATGCGGAC CGATTTTCAA GTTTTGAGGA AGCAGTCAAA 660 

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTG 720 

GATTTCAAAG CGATTATTGA TGGAGTCGAA AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 780 

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAGAT 900

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960 

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080 

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140 

TGGGAAAGAC CTCGAGTCGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGTACCAG 1200 

CAGTTGGATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTCGACAT GCCTACTGAT 1380 

AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAAGAGGGAA AAGACATTGA AGAAGGCGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500 

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TCGCATAGGG 1560 

ACGAAATACA ATGAAGCCAA GACTAACCGA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 

GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

AACTTGTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

GAAAACCCAG AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100 
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GTGTGGTTTC CTAGCTCTAC AGACATAACA GCACAGCCCG ATGTT6GATC AGGCAGAGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGAGATA CGTGTTGATG AATCTGAGAA GACAACCAAG 2220 

TCCTTTTCTG CAGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 

TCCAGACAAC AGGATTTGGT CTCCACGGTC AACGTGGTAT ACTCGCAGAC AACCCAACCG 2400 

GTATACAATG CAGAGGCCAG TAATAGTAGC CATGAGTCTC GTATTGGTCT AGCTGAGGGG 2460 

TTGGAATCCG AGAAGAAGGC AGTTATACCC CTTGTGATCG TGTCAGCCCT GACTTTTATC 2520 

TGTCTAGTGG TTCTTGTGGG TATTCTCATC TACTGGAGGA AATGCTTCCA GACTGCACAC 2580 

TTTTACTTAG AGGACAGTAC ATCCCCTAGA GTTATATCCA CACCTCCAAC ACCTATCTTT 2640 

CCAATTTCAG ATGATGTCGG AGCAATTCCA ATAAAGCACT TTCCAAAGCA TGTTGCAGAT 2700 

TTACATGCAA GTAGTGGGTT TACTGAAGAA TTTGAGACAC TGAAAGAGTT TTACCAGGAA 2760 

GTGCAGAGCT GTACTGTTGA CTTAGGTATT ACAGCAGACA GCTCCAACCA CCCAGACAAC 2820 

AAGCACAAGA ATCGATACAT AAATATCGTT GCCTATGATC ATAGCAGGGT TAAGCTAGCA 2880 

CAGCTTGCTG AAAAGGATGG CAAACTGACT GATTATATCA ATGCCAATTA TGTTGATGGC 2940 

TACAACAGAC CAAAAGCTTA TATTGCTGCC CAAGGCCCAC TGAAATCCAC AGCTGAAGAT 3000 

TTCTGGAGAA TGATATGGGA ACATAATGTG GAAGTTATTG TCATGATAAC AAACCTCGTG 3060 

GAGAAAGGAA GGAGAAAATG TGATCAGTAC TGGCCTGCCG ATGGGAGTGA GGAGTACGGG 3120 

AACTTTCTGG TCACTCAGAA GAGTGTGCAA GTGCTTGCCT ATTATACTGT GAGGAATTTT 3180 

ACTCTAAGAA ACACAAAAAT AAAAAAGGGC TCCCAGAAAG GAAGACCCAG TGGACGTGTG 3240

GTCACACAGT ATCACTACAC GCAGTGGCCT GACATGGGAG TACCAGAGTA CTCCCTGCCA 3300 

GTGCTGACCT TTGTGAGAAA GGCAGCCTAT GCCAAGCGCC ATGCAGTGGG GCCTGTTGTC 3360 

GTCCACTGCA GTGCTGGAGT TGGAAGAACA GGCACATATA TTGTGCTAGA CAGTATGTTG 3420 

CAGCAGATTC AACACGAAGG AACTGTCAAC ATATTTGGCT TCTTAAAACA CATCCGTTCA 3480 

CAAAGAAATT ATTTGGTACA AACTGAGGAG CAATATGTCT TCATTCATGA TACACTGGTT 3540 

GAGGCCATAC TTAGTAAAGA AACTGAGGTG CTGGACAGTC ATATTCATGC CTATGTTAAT 3600 

GCACTCCTCA TTCCTGGACC AGCAGGCAAA ACAAAGCTAG AGAAACAATT CCAGCTCCTG 3660 

AGCCAGTCAA ATATACAGCA GAGTGACTAT TCTGCAGCCC TAAAGCAATG CAACAGGGAA 3720 

AAGAATCGAA CTTCTTCTAT CATCCCTGTG GAAAGATCAA GGGTTGGCAT TTCATCCCTG 3780 

AGTGGAGAAG GCACAGACTA CATCAATGCC TCCTATATCA TGGGCTATTA CCAGAGCAAT 3840 

GAATTCATCA TTACCCAGCA CCCTCTCCTT CATACCATCA AGGATTTCTG GAGGATGATA 3900 

TGGGACCATA ATGCCCAACT GGTGGTTATG ATTCCTGATG GCCAAAACAT GGCAGAAGAT 3960 

GAATTTGTTT ACTGGCCAAA TAAAGATGAG CCTATAAATT GTGAGAGCTT TAAGGTCACT 4020 

CTTATGGCTG AAGAACACAA ATGTCTATCT AATGAGGAAA AACTTATAAT TCAGGACTTT 4080 

ATCTTAGAAG CTACACAGGA TGATTATGTA CTTGAAGTGA GGCACTTTCA GTGTCCTAAA 4140 

TGGCCAAATC CAGATAGCCC CATTAGTAAA ACTTTTGAAC TTATAAGTGT TATAAAAGAA 4200 

GAAGCTGCCA ATAGGGATGG GCCTATGATT GTTCATGATG AGCATGGAGG AGTGACGGCA 4260 

GGAACTTTCT GTGCTCTGAC AACCCTTATG CACCAACTAG AAAAAGAAAA TTCCGTGGAT 4320 

GTTTACCAGG TAGCCAAGAT GATCAATCTG ATGAGGCCAG GAGTCTTTGC TGACATTGAG 4380 

CAGTATCAGT TTCTCTACAA AGTGATCCTC AGCCTTGTGA GCACAAGGCA GGAAGAGAAT 4440 

CCATCCACCT CTCTGGACAG TAATGGTGCA GCATTGCCTG ATGGAAATAT AGCTGAGAGC 4500 

TTAGAGTCTT TAGTTTAACA CAGAAAGGGG TGGGGGGACT CACATCTGAG CATTGTTTTC 4560 

CTCTTCCTAA AATTAGGCAG GAAAATCAGT CTAGTTCTGT TATCTGTTGA TTTCCCATCA 4620 

CCTGACAGTA ACTTTCATGA CATAGGATTC TGCCGCCAAA TTTATATCAT TAACAATGTG 4680 

TGCCTTTTTG CAAGACTTGT AATTTACTTA TTATGTTTGA ACTAAAATGA TTGAATTTTA 4740 

CAGTATTTCT AAGAATGGAA TTGTGGTATT TTTTTCTGTA TTGATTTTAA CAGAAAATTT 4800 

CAATTTATAG AGGTTAGGAA TTCCAAACTA CAGAAAATGT TTGTTTTTAG TGTCAAATTT 4860 

TTAGCTGTAT TTGTAGCAAT TATCAGGTTT GCTAGAAATA TAACTTTTAA TACAGTAGCC 4920 

TGTAAATAAA ACACTCTTCC ATATGATATT CAACATTTTA CAACTGCAGT ATTCACCTAA 4980 

AGTAGAAATA ATCTGTTACT TATTGTAAAT ACTGCCCTAG TGTCTCCATG GACCAAATTT 5040 

ATATTTATAA TTGTAGATTT TTATATTTTA CTACTGAGTC AAGTTTTCTA GTTCTGTGTA 5100 

ATTGTTTAGT TTAATGACGT AGTTCATTAG CTGGTCTTAC TCTACCAGTT TTCTGACATT 5160 

GTATTGTGTT ACCTAAGTCA TTAACTTTGT TTCAGCATGT AATTTTAACT TTTGTGGAAA 5220 

ATAGAAATAC CTTCATTTTG AAAGAAGTTT TTATGAGAAT AACACCTTAC CAAACATTGT 5280 

TCAAATGGTT TTTATCCAAG GAATTGCAAA AATAAATATA AATATTGCCA TTAAAAAAAA 5340 
AAAAAAAAAA AAAAAAAAAA AAAAAAA 



Seq ID NO: 575 Protein sequence: 
Protein Accession #: EOS sequence

1 11 21 31 41 51 

I I I I I I 

MRILKRFLAC IQLLCVCRLD WANGYYRQQR KLVEEIGWSY TGALNQKNWG KKYPTCNSPK 60 

QSPINIDEDL TQVNVNLKKL KFQGWDKTSL ENTFIHNTGK TVEINLTNDY RVSGGVSEMV 120 

FKASKITFHW GKCNMSSDGS EHSLEGQKFP LEMQIYCFDA DRFSSFEEAV KGKGKLRALS 180 

ILFEVGTEEN LDFKAIIDGV ESVSRFGKQA ALDPFILLNL LPNSTDKYYI YNGSLTSPPC 240

TDTVDWIVFK DTVSISESQL AVFCEVLTMQ QSGYVMLMDY LQNNFREQQY KFSRQVFSSY 300 

TGKEEIHEAV CSSEPENVQA DPENYTSLLV TWERPRVVYD TMIEKFAVLY QQLDGEDQTK 360

HEFLTDGYQD LGAILNNLLP NMSYVLQIVA ICTNGLYGKY SDQLIVDMPT DNPELDLFPE 420

LIGTEEIIKE EEEGKDIEEG AIVNPGRDSA TNQIRKKEPQ ISTTTHYNRI GTKYNEAKTN 480 

RSPTRGSEFS GKGDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTELPPHT VEGTSASLND 540 

GSKTVLRSPH MNLSGTAESL NTVSITEYEE ESLLTSFKLD TGAEDSSGSS PATSAIPFIS 600 

ENISQGYIFS SENPETITYD VLIPESARNA SEDSTSSGSE ESLKDPSMEG NVWFPSSTDI 660 

TAQPDVGSGR ESFLQTNYTE IRVDESEKTT KSFSAGPVMS QGPSVTDLEM PHYSTFAYFP 720 

TEVTPHAFTP SSRQQDLVST VNVVYSQTTQ PVYNAEASNS SHESRIGLAE GLESEKKAVI 780

PLVIVSALTF ICLVVLVGIL IYWRKCFQTA HFYLEDSTSP RVISTPPTPI FPISDDVGAI 840

PIKHFPKHVA DLHASSGFTE EFETLKEFYQ EVQSCTVDLG ITADSSNHPD NKHKNRYINI 900

VAYDHSRVKL AQLAEKDGKL TDYINANYVD GYNRPKAYIA AQGPLKSTAE DFWRMIWEHN 960

VEVIVMITNL VEKGRRKCDQ YWPADGSEEY GNFLVTQKSV QVLAYYTVRN FTLRNTKIKK 1020 

GSQKGRPSGR VVTQYHYTQW PDMGVPEYSL PVLTFVRKAA YAKRHAVGPV VVHCSAGVGR 1080

TGTYIVLDSM LQQIQHEGTV NIFGFLKHIR SQRNYLVQTE EQYVFIHDTL VEAILSKETE 1140

VLDSHIHAYV NALLIPGPAG KTKLEKQFQL LSQSNIQQSD YSAALKQCNR EKNRTSSIIP 1200 

VERSRVGISS LSGEGTDYIN ASYIMGYYQS NEFIITQHPL LHTIKDFWRM IWDHNAQLVV 1260

MIPDGQNMAE DEFVYWPNKD EPINCESFKV TLMAEEHKCL SNEEKLIIQD FILEATQDDY 1320

VLEVRHFQCP KWPNPDSPIS KTFELISVIK EEAANRDGPM IVHDEHGGVT AGTFCALTTL 1380 

MHQLEKENSV DVYQVAKMIN LMRPGVFADI EQYQFLYKVI LSLVSTRQEE NPSTSLDSNG 1440
AALPDGNIAE SLESLV 
Seq ID NO: 576 DNA sequence

Nucleic Acid Accession #: EOS sequence

Coding sequence: 148-4494 

1 11 21 31 41 51 

I I I I I I 

CACACATACG CACGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA 120 

CGGCGAGGGG CCGCAGACCG TCTGGAAATG CGAATCCTAA AGCGTTTCCT CGCTTGCATT 180 

CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGGAAAG 300 

AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 

CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACCGT 480 

GTCAGCGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540 

AAATGCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 

GAGATGCAAA TCTACTGCTT TGATGCAGAC CGATTTTCAA GTTTTGAGGA AGCAGTCAAA 660 

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTG 720 

GATTTCAAAG CGATTATTGA TGGAGTCGAA AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 780 

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAGAT 900 

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960 

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080 

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140 

TGGGAAAGAC CTCGAGTCGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGTACCAG 1200 

CAGTTGGATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTCGACAT GCCTACTGAT 1380 

AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAAGAGGGAA AAGACATTGA AGAAGGCGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500 

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TCGCATAGGG 1560 

ACGAAATACA ATGAAGCCAA GACTAACCGA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 

GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

AACTTGTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

GAAAACCCAG AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100 

GTGTGGTTTC CTAGCTCTAC AGACATAACA GCACAGCCCG ATGTTGGATC AGGCAGAGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGAGATA CGTGTTGATG AATCTGAGAA GACAACCAAG 2220 

TCCTTTTCTG CAGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 

TCCAGACAAC AGGATTTGGT CTCCACGGTC AACGTGGTAT ACTCGCAGAC AACCCAACCG 2400 

GTATACAATG AGGCCAGTAA TAGTAGCCAT GAGTCTCGTA TTGGTCTAGC TGAGGGGTTG 2460 

GAATCCGAGA AGAAGGCAGT TATACCCCTT GTGATCGTGT CAGCCCTGAC TTTTATCTGT 2520 

CTAGTGGTTC TTGTGGGTAT TCTCATCTAC TGGAGGAAAT GCTTCCAGAC TGCACACTTT 2580 

TACTTAGAGG ACAGTACATC CCCTAGAGTT ATATCCACAC CTCCAACACC TATCTTTCCA 2640 

ATTTCAGATG ATGTCGGAGC AATTCCAATA AAGCACTTTC CAAAGCATGT TGCAGATTTA 2700 

CATGCAAGTA GTGGGTTTAC TGAAGAATTT GAGGAAGTGC AGAGCTGTAC TGTTGACTTA 2760 

GGTATTACAG CAGACAGCTC CAACCACCCA GACAACAAGC ACAAGAATCG ATACATAAAT 2820 

ATCGTTGCCT ATGATCATAG CAGGGTTAAG CTAGCACAGC TTGCTGAAAA GGATGGCAAA 2880 

CTGACTGATT ATATCAATGC CAATTATGTT GATGGCTACA ACAGACCAAA AGCTTATATT 2940 

GCTGCCCAAG GCCCACTGAA ATCCACAGCT GAAGATTTCT GGAGAATGAT ATGGGAACAT 3000 

AATGTGGAAG TTATTGTCAT GATAACAAAC CTCGTGGAGA AAGGAAGGAG AAAATGTGAT 3060 

CAGTACTGGC CTGCCGATGG GAGTGAGGAG TACGGGAACT TTCTGGTCAC TCAGAAGAGT 3120 

GTGCAAGTGC TTGCCTATTA TACTGTGAGG AATTTTACTC TAAGAAACAC AAAAATAAAA 3180 

AAGGGCTCCC AGAAAGGAAG ACCCAGTGGA CGTGTGGTCA CACAGTATCA CTACACGCAG 3240 

TGGCCTGACA TGGGAGTACC AGAGTACTCC CTGCCAGTGC TGACCTTTGT GAGAAAGGCA 3300 

GCCTATGCCA AGCGCCATGC AGTGGGGCCT GTTGTCGTCC ACTGCAGTGC TGGAGTTGGA 3360 

AGAACAGGCA CATATATTGT GCTAGACAGT ATGTTGCAGC AGATTCAACA CGAAGGAACT 3420 

GTCAACATAT TTGGCTTCTT AAAACACATC CGTTCACAAA GAAATTATTT GGTACAAACT 3480 

GAGGAGCAAT ATGTCTTCAT TCATGATACA CTGGTTGAGG CCATACTTAG TAAAGAAACT 3540 

GAGGTGCTGG ACAGTCATAT TCATGCCTAT GTTAATGCAC TCCTCATTCC TGGACCAGCA 3600 

GGCAAAACAA AGCTAGAGAA ACAATTCCAG CTCCTGAGCC AGTCAAATAT ACAGCAGAGT 3660 

GACTATTCTG CAGCCCTAAA GCAATGCAAC AGGGAAAAGA ATCGAACTTC TTCTATCATC 3720 

CCTGTGGAAA GATCAAGGGT TGGCATTTCA TCCCTGAGTG GAGAAGGCAC AGACTACATC 3780 

AATGCCTCCT ATATCATGGG CTATTACCAG AGCAATGAAT TCATCATTAC CCAGCACCCT 3840 

CTCCTTCATA CCATCAAGGA TTTCTGGAGG ATGATATGGG ACCATAATGC CCAACTGGTG 3900 

GTTATGATTC CTGATGGCCA AAACATGGCA GAAGATGAAT TTGTTTACTG GCCAAATAAA 3960 

GATGAGCCTA TAAATTGTGA GAGCTTTAAG GTCACTCTTA TGGCTGAAGA ACACAAATGT 4020 

CTATCTAATG AGGAAAAACT TATAATTCAG GACTTTATCT TAGAAGCTAC ACAGGATGAT 4080 

TATGTACTTG AAGTGAGGCA CTTTCAGTGT CCTAAATGGC CAAATCCAGA TAGCCCCATT 4140 

AGTAAAACTT TTGAACTTAT AAGTGTTATA AAAGAAGAAG CTGCCAATAG GGATGGGCCT 4200 

ATGATTGTTC ATGATGAGCA TGGAGGAGTG ACGGCAGGAA CTTTCTGTGC TCTGACAACC 4260 

CTTATGCACC AACTAGAAAA AGAAAATTCC GTGGATGTTT ACCAGGTAGC CAAGATGATC 4320 

AATCTGATGA GGCCAGGAGT CTTTGCTGAC ATTGAGCAGT ATCAGTTTCT CTACAAAGTG 4380 

ATCCTCAGCC TTGTGAGCAC AAGGCAGGAA GAGAATCCAT CCACCTCTCT GGACAGTAAT 4440 

GGTGCAGCAT TGCCTGATGG AAATATAGCT GAGAGCTTAG AGTCTTTAGT TTAACACAGA 4500 

AAGGGGTGGG GGGACTCACA TCTGAGCATT GTTTTCCTCT TCCTAAAATT AGGCAGGAAA 4560 

ATCAGTCTAG TTCTGTTATC TGTTGATTTC CCATCACCTG ACAGTAACTT TCATGACATA 4620 

GGATTCTGCC GCCAAATTTA TATCATTAAC AATGTGTGCC TTTTTGCAAG ACTTGTAATT 4680 

TACTTATTAT GTTTGAACTA AAATGATTGA ATTTTACAGT ATTTCTAAGA ATGGAATTGT 4740 

GGTATTTTTT TCTGTATTGA TTTTAACAGA AAATTTCAAT TTATAGAGGT TAGGAATTCC 4800 
AAACTACAGA AAATGTTTGT TTTTAGTGTC AAATTTTTAG CTGTATTTGT AGCAATTATC 4860

AGGTTTGCTA GAAATATAAC TTTTAATACA GTAGCCTGTA AATAAAACAC TCTTCCATAT 4920 

GATATTCAAC ATTTTACAAC TGCAGTATTC ACCTAAAGTA GAAATAATCT GTTACTTATT 4980 

GTAAATACTG CCCTAGTGTC TCCATGGACC AAATTTATAT TTATAATTGT AGATTTTTAT 5040 

ATTTTACTAC TGAGTCAAGT TTTCTAGTTC TGTGTAATTG TTTAGTTTAA TGACGTAGTT 5100 

CATTAGCTGG TCTTACTCTA CCAGTTTTCT GACATTGTAT TGTGTTACCT AAGTCATTAA 5160 

CTTTGTTTCA GCATGTAATT TTAACTTTTG TGGAAAATAG AAATACCTTC ATTTTGAAAG 5220 

AAGTTTTTAT GAGAATAACA CCTTACCAAA CATTGTTCAA ATGGTTTTTA TCCAAGGAAT 5280 

TGCAAAAATA AATATAAATA TTGCCATTAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 5340 
AAA 

Seq ID NO: 577 Protein sequence: 
Protein Accession #: EOS sequence

1 11 21 31 41 51 

I I I I I I 

MRILKRFLAC IQLLCVCRLD WANGYYRQQR KLVEEIGWSY TGALNQKNWG KKYPTCNSPK 60 

QSPINIDEDL TQVNVNLKKL KFQGWDKTSL ENTFIHNTGK TVEINLTNDY RVSGGVSEMV 120 

FKASKITFHW GKCNMSSDGS EHSLEGQKFP LEMQIYCFDA DRFSSFEEAV KGKGKLRALS 180 

ILFEVGTEEN LDFKAIIDGV ESVSRFGKQA ALDPFILLNL LPNSTDKYYI YNGSLTSPPC 240 

TDTVDWIVFK DTVSISESQL AVFCEVLTMQ QSGYVMLMDY LQNNFREQQY KFSRQVFSSY 300 

TGKEEIHEAV CSSEPENVQA DPENYTSLLV TWERPRVVYD TMIEKFAVLY QQLDGEDQTK 360

HEFLTDGYQD LGAILNNLLP NMSYVLQIVA ICTNGLYGKY SDQLIVDMPT DNPELDLFPE 420 

LIGTEEIIKE EEEGKDIEEG AIVNPGRDSA TNQIRKKEPQ ISTTTHYNRI GTKYNEAKTN 480 

RSPTRGSEFS GKGDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTELPPHT VEGTSASLND 540

GSKTVLRSPH MNLSGTAESL NTVSITEYEE ESLLTSFKLD TGAEDSSGSS PATSAIPFIS 600

ENISQGYIFS SENPETITYD VLIPESARNA SEDSTSSGSE ESLKDPSMEG NVWFPSSTDI 660

TAQPDVGSGR ESFLQTNYTE IRVDESEKTT KSFSAGPVMS QGPSVTDLEM PHYSTFAYFP 720 

TEVTPHAFTP SSRQQDLVST VNVVYSQTTQ PVYNEASNSS HESRIGLAEG LESEKKAVIP 780

LVIVSALTFI CLVVLVGILI YWRKCFQTAH FYLEDSTSPR VISTPPTPIF PISDDVGAIP 840

IKHFPKHVAD LHASSGFTEE FEEVQSCTVD LGITADSSNH PDNKHKNRYI NIVAYDHSRV 900 

KLAQLAEKDG KLTDYINANY VDGYNRPKAY IAAQGPLKST AEDFWRMIWE HNVEVIVMIT 960

NLVEKGRRKC DQYWPADGSE EYGNFLVTQK SVQVLAYYTV RNFTLRNTKI KKGSQKGRPS 1020 

GRVVTQYHYT QWPDMGVPEY SLPVLTFVRK AAYAKRHAVG PVVVHCSAGV GRTGTYIVLD 1080

SMLQQIQHEG TVNIFGFLKH IRSQRNYLVQ TEEQYVFIHD TLVEAILSKE TEVLDSHIHA 1140 

YVNALLIPGP AGKTKLEKQF QLLSQSNIQQ SDYSAALKQC NREKNRTSSI IPVERSRVGI 1200 

SSLSGEGTDY INASYIMGYY QSNEFIITQH PLLHTIKDFW RMIWDHNAQL VVMIPDGQNM 1260

AEDEFVYWPN KDEPINCESF KVTLMAEEHK CLSNEEKLII QDFILEATQD DYVLEVRHFQ 1320

CPKWPNPDSP ISKTFELISV IKEEAANRDG PMIVHDEHGG VTAGTFCALT TLMHQLEKEN 1380 

SVDVYQVAKM INLMRPGVFA DIEQYQFLYK VILSLVSTRQ EENPSTSLDS NGAALPDGNI 1440 
AESLESLV



Seq ID NO: 578 DNA sequence 

Nucleic Acid Accession #: EOS sequence

Coding sequence: 501-4514 

1 11 21 31 41 51 

I I I I I I

CACACATACG CACGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 60 

CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA 120 

CGGCGAGGGG CCGCAGACCG TCTGGAAATG CGAATCCTAA AGCGTTTCCT CGCTTGCATT 180 

CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAT TGGGGAAAGA 300 

AATATCCAAC ATGTAATAGC CCAAAACAAT CTCCTATCAA TATTGATGAA GATCTTACAC 360 

AAGTAAATGT GAATCTTAAG AAACTTAAAT TTCAGGGTTG GGATAAAACA TCATTGGAAA 420 

ACACATTCAT TCATAACACT GGGAAAACAG TGGAAATTAA TCTCACTAAT GACTACCGTG 480 

TCAGCGGAGG AGTTTCAGAA ATGGTGTTTA AAGCAAGCAA GATAACTTTT CACTGGGGAA 540 

AATGCAATAT GTCATCTGAT GGATCAGAGC ATAGTTTAGA AGGACAAAAA TTTCCACTTG 600

AGATGCAAAT CTACTGCTTT GATGCGGACC GATTTTCAAG TTTTGAGGAA GCAGTCAAAG 660 

GAAAAGGGAA GTTAAGAGCT TTATCCATTT TGTTTGAGGT TGGGACAGAA GAAAATTTGG 720 

ATTTCAAAGC GATTATTGAT GGAGTCGAAA GTGTTAGTCG TTTTGGGAAG CAGGCTGCTT 780 

TAGATCCATT CATACTGTTG AACCTTCTGC CAAACTCAAC TGACAAGTAT TACATTTACA 840 

ATGGCTCATT GACATCTCCT CCCTGCACAG ACACAGTTGA CTGGATTGTT TTTAAAGATA 900 

CAGTTAGCAT CTCTGAAAGC CAGTTGGCTG TTTTTTGTGA AGTTCTTACA ATGCAACAAT 960 

CTGGTTATGT CATGCTGATG GACTACTTAC AAAACAATTT TCGAGAGCAA CAGTACAAGT 1020 

TCTCTAGACA GGTGTTTTCC TCATACACTG GAAAGGAAGA GATTCATGAA GCAGTTTGTA 1080 

GTTCAGAACC AGAAAATGTT CAGGCTGACC CAGAGAATTA TACCAGCCTT CTTGTTACAT 1140 

GGGAAAGACC TCGAGTCGTT TATGATACCA TGATTGAGAA GTTTGCAGTT TTGTACCAGC 1200 

AGTTGGATGG AGAGGACCAA ACCAAGCATG AATTTTTGAC AGATGGCTAT CAAGACTTGG 1260 

GTGCTATTCT CAATAATTTG CTACCCAATA TGAGTTATGT TCTTCAGATA GTAGCCATAT 1320 

GCACTAATGG CTTATATGGA AAATACAGCG ACCAACTGAT TGTCGACATG CCTACTGATA 1380 

ATCCTGAACT TGATCTTTTC CCTGAATTAA TTGGAACTGA AGAAATAATC AAGGAGGAGG 1440 

AAGAGGGAAA AGACATTGAA GAAGGCGCTA TTGTGAATCC TGGTAGAGAC AGTGCTACAA 1500 

ACCAAATCAG GAAAAAGGAA CCCCAGATTT CTACCACAAC ACACTACAAT CGCATAGGGA 1560 

CGAAATACAA TGAAGCCAAG ACTAACCGAT CCCCAACAAG AGGAAGTGAA TTCTCTGGAA 1620 

AGGGTGATGT TCCCAATACA TCTTTAAATT CCACTTCCCA ACCAGTCACT AAATTAGCCA 1680 

CAGAAAAAGA TATTTCCTTG ACTTCTCAGA CTGTGACTGA ACTGCCACCT CACACTGTGG 1740 

AAGGTACTTC AGCCTCTTTA AATGATGGCT CTAAAACTGT TCTTAGATCT CCACATATGA 1800 

ACTTGTCGGG GACTGCAGAA TCCTTAAATA CAGTTTCTAT AACAGAATAT GAGGAGGAGA 1860 

GTTTATTGAC CAGTTTCAAG CTTGATACTG GAGCTGAAGA TTCTTCAGGC TCCAGTCCCG 1920 

CAACTTCTGC TATCCCATTC ATCTCTGAGA ACATATCCCA AGGGTATATA TTTTCCTCCG 1980 

AAAACCCAGA GACAATAACA TATGATGTCC TTATACCAGA ATCTGCTAGA AATGCTTCCG 2040 

AAGATTCAAC TTCATCAGGT TCAGAAGAAT CACTAAAGGA TCCTTCTATG GAGGGAAATG 2100 

TGTGGTTTCC TAGCTCTACA GACATAACAG CACAGCCCGA TGTTGGATCA GGCAGAGAGA 2160 

GCTTTCTCCA GACTAATTAC ACTGAGATAC GTGTTGATGA ATCTGAGAAG ACAACCAAGT 2220 

CCTTTTCTGC AGGCCCAGTG ATGTCACAGG GTCCCTCAGT TACAGATCTG GAAATGCCAC 2280 
ATTATTCTAC CTTTGCCTAC TTCCCAACTG AGGTAACACC TCATGCTTTT ACCCCATCCT 2340

CCAGACAACA GGATTTGGTC TCCACGGTCA ACGTGGTATA CTCGCAGACA ACCCAACCGG 2400 

TATACAATGA GGCCAGTAAT AGTAGCCATG AGTCTCGTAT TGGTCTAGCT GAGGGGTTGG 2460 

AATCCGAGAA GAAGGCAGTT ATACCCCTTG TGATCGTGTC AGCCCTGACT TTTATCTGTC 2520

TAGTGGTTCT TGTGGGTATT CTCATCTACT GGAGGAAATG CTTCCAGACT GCACACTTTT 2580 

ACTTAGAGGA CAGTACATCC CCTAGAGTTA TATCCACACC TCCAACACCT ATCTTTCCAA 2640 

TTTCAGATGA TGTCGGAGCA ATTCCAATAA AGCACTTTCC AAAGCATGTT GCAGATTTAC 2700 

ATGCAAGTAG TGGGTTTACT GAAGAATTTG AGACACTGAA AGAGTTTTAC CAGGAAGTGC 2760 

AGAGCTGTAC TGTTGACTTA GGTATTACAG CAGACAGCTC CAACCACCCA GACAACAAGC 2820 

ACAAGAATCG ATACATAAAT ATCGTTGCCT ATGATCATAG CAGGGTTAAG CTAGCACAGC 2880 

TTGCTGAAAA GGATGGCAAA CTGACTGATT ATATCAATGC CAATTATGTT GATGGCTACA 2940 

ACAGACCAAA AGCTTATATT GCTGCCCAAG GCCCACTGAA ATCCACAGCT GAAGATTTCT 3000 

GGAGAATGAT ATGGGAACAT AATGTGGAAG TTATTGTCAT GATAACAAAC CTCGTGGAGA 3060 

AAGGAAGGAG AAAATGTGAT CAGTACTGGC CTGCCGATGG GAGTGAGGAG TACGGGAACT 3120 

TTCTGGTCAC TCAGAAGAGT GTGCAAGTGC TTGCCTATTA TACTGTGAGG AATTTTACTC 3180 

TAAGAAACAC AAAAATAAAA AAGGGCTCCC AGAAAGGAAG ACCCAGTGGA CGTGTGGTCA 3240 

CACAGTATCA CTACACGCAG TGGCCTGACA TGGGAGTACC AGAGTACTCC CTGCCAGTGC 3300 

TGACCTTTGT GAGAAAGGCA GCCTATGCCA AGCGCCATGC AGTGGGGCCT GTTGTCGTCC 3360 

ACTGCAGTGC TGGAGTTGGA AGAACAGGCA CATATATTGT GCTAGACAGT ATGTTGCAGC 3420 

AGATTCAACA CGAAGGAACT GTCAACATAT TTGGCTTCTT AAAACACATC CGTTCACAAA 3480 

GAAATTATTT GGTACAAACT GAGGAGCAAT ATGTCTTCAT TCATGATACA CTGGTTGAGG 3540 

CCATACTTAG TAAAGAAACT GAGGTGCTGG ACAGTCATAT TCATGCCTAT GTTAATGCAC 3600 

TCCTCATTCC TGGACCAGCA GGCAAAACAA AGCTAGAGAA ACAATTCCAG CTCCTGAGCC 3660 

AGTCAAATAT ACAGCAGAGT GACTATTCTG CAGCCCTAAA GCAATGCAAC AGGGAAAAGA 3720 

ATCGAACTTC TTCTATCATC CCTGTGGAAA GATCAAGGGT TGGCATTTCA TCCCTGAGTG 3780 

GAGAAGGCAC AGACTACATC AATGCCTCCT ATATCATGGG CTATTACCAG AGCAATGAAT 3840 

TCATCATTAC CCAGCACCCT CTCCTTCATA CCATCAAGGA TTTCTGGAGG ATGATATGGG 3900

ACCATAATGC CCAACTGGTG GTTATGATTC CTGATGGCCA AAACATGGCA GAAGATGAAT 3960 

TTGTTTACTG GCCAAATAAA GATGAGCCTA TAAATTGTGA GAGCTTTAAG GTCACTCTTA 4020 

TGGCTGAAGA ACACAAATGT CTATCTAATG AGGAAAAACT TATAATTCAG GACTTTATCT 4080 

TAGAAGCTAC ACAGGATGAT TATGTACTTG AAGTGAGGCA CTTTCAGTGT CCTAAATGGC 4140 

CAAATCCAGA TAGCCCCATT AGTAAAACTT TTGAACTTAT AAGTGTTATA AAAGAAGAAG 4200 

CTGCCAATAG GGATGGGCCT ATGATTGTTC ATGATGAGCA TGGAGGAGTG ACGGCAGGAA 4260 

CTTTCTGTGC TCTGACAACC CTTATGCACC AACTAGAAAA AGAAAATTCC GTGGATGTTT 4320 

ACCAGGTAGC CAAGATGATC AATCTGATGA GGCCAGGAGT CTTTGCTGAC ATTGAGCAGT 4380 

ATCAGTTTCT CTACAAAGTG ATCCTCAGCC TTGTGAGCAC AAGGCAGGAA GAGAATCCAT 4440 

CCACCTCTCT GGACAGTAAT GGTGCAGCAT TGCCTGATGG AAATATAGCT GAGAGCTTAG 4500 

AGTCTTTAGT TTAACACAGA AAGGGGTGGG GGGACTCACA TCTGAGCATT GTTTTCCTCT 4560 

TCCTAAAATT AGGCAGGAAA ATCAGTCTAG TTCTGTTATC TGTTGATTTC CCATCACCTG 4620 

ACAGTAACTT TCATGACATA GGATTCTGCC GCCAAATTTA TATCATTAAC AATGTGTGCC 4680

TTTTTGCAAG ACTTGTAATT TACTTATTAT GTTTGAACTA AAATGATTGA ATTTTACAGT 4740 

ATTTCTAAGA ATGGAATTGT GGTATTTTTT TCTGTATTGA TTTTAACAGA AAATTTCAAT 4800 

TTATAGAGGT TAGGAATTCC AAACTACAGA AAATGTTTGT TTTTAGTGTC AAATTTTTAG 4860 

CTGTATTTGT AGCAATTATC AGGTTTGCTA GAAATATAAC TTTTAATACA GTAGCCTGTA 4920 

AATAAAACAC TCTTCCATAT GATATTCAAC ATTTTACAAC TGCAGTATTC ACCTAAAGTA 4980 

GAAATAATCT GTTACTTATT GTAAATACTG CCCTAGTGTC TCCATGGACC AAATTTATAT 5040 

TTATAATTGT AGATTTTTAT ATTTTACTAC TGAGTCAAGT TTTCTAGTTC TGTGTAATTG 5100 

TTTAGTTTAA TGACGTAGTT CATTAGCTGG TCTTACTCTA CCAGTTTTCT GACATTGTAT 5160 

TGTGTTACCT AAGTCATTAA CTTTGTTTCA GCATGTAATT TTAACTTTTG TGGAAAATAG 5220 

AAATACCTTC ATTTTGAAAG AAGTTTTTAT GAGAATAACA CCTTACCAAA CATTGTTCAA 5280 

ATGGTTTTTA TCCAAGGAAT TGCAAAAATA AATATAAATA TTGCCATTAA AAAAAAAAAA 5340 
AAAAAAAAAA AAAAAAAAAA AAA 

Seq ID NO: 579 Protein sequence: 
Protein Accession #: EOS sequence 

1 11 21 31 41 51 

I I I I I I 

MVFKASKITF HWGKCNMSSD GSEHSLEGQK FPLEMQIYCF DADRFSSFEE AVKGKGKLRA 60

LSILFEVGTE ENLDFKAIID GVESVSRFGK QAALDPFILL NLLPNSTDKY YIYNGSLTSP 120

PCTDTVDWIV FKDTVSISES QLAVFCEVLT MQQSGYVMLM DYLQNNFREQ QYKFSRQVFS 180 

SYTGKEEIHE AVCSSEPENV QADPENYTSL LVTWERPRVV YDTMIEKFAV LYQQLDGEDQ 240

TKHEFLTDGY QDLGAILNNL LPNMSYVLQI VAICTNGLYG KYSDQLIVDM PTDNPELDLF 300 

PELIGTEEII KEEEEGKDIE EGAIVNPGRD SATNQIRKKE PQISTTTHYN RIGTKYNEAK 360

TNRSPTRGSE FSGKGDVPNT SLNSTSQPVT KLATEKDISL TSQTVTELPP HTVEGTSASL 420 

NDGSKTVLRS PHMNLSGTAE SLNTVSITEY EEESLLTSFK LDTGAEDSSG SSPATSAIPF 480 

ISENISQGYI FSSENPETIT YDVLIPESAR NASEDSTSSG SEESLKDPSM EGNVWFPSST 540

DITAQPDVGS GRESFLQTNY TEIRVDESEK TTKSFSAGPV MSQGPSVTDL EMPHYSTFAY 600

FPTEVTPHAF TPSSRQQDLV STVNVVYSQT TQPVYNEASN SSHESRIGLA EGLESEKKAV 660

IPLVIVSALT FICLVVLVGI LIYWRKCFQT AHFYLEDSTS PRVISTPPTP IFPISDDVGA 720

IPIKHFPKHV ADLHASSGFT EEFETLKEFY QEVQSCTVDL GITADSSNHP DNKHKNRYIN 780 

IVAYDHSRVK LAQLAEKDGK LTDYINANYV DGYNRPKAYI AAQGPLKSTA EDFWRMIWEH 840 

NVEVIVMITN LVEKGRRKCD QYWPADGSEE YGNFLVTQKS VQVLAYYTVR NFTLRNTKIK 900 

KGSQKGRPSG RVVTQYHYTQ WPDMGVPEYS LPVLTFVRKA AYAKRHAVGP VVVHCSAGVG 960

RTGTYIVLDS MLQQIQHEGT VNIFGFLKHI RSQRNYLVQT EEQYVFIHDT LVEAILSKET 1020

EVLDSHIHAY VNALLIPGPA GKTKLEKQFQ LLSQSNIQQS DYSAALKQCN REKNRTSSII 1080 

PVERSRVGIS SLSGEGTDYI NASYIMGYYQ SNEFIITQHP LLHTIKDFWR MIWDHNAQLV 1140 

VMIPDGQNMA EDEFVYWPNK DEPINCESFK VTLMAEEHKC LSNEEKLIIQ DFILEATQDD 1200

YVLEVRHFQC PKWPNPDSPI SKTFELISVI KEEAANRDGP MIVHDEHGGV TAGTFCALTT 1260 

LMHQLEKENS VDVYQVAKMI NLMRPGVFAD IEQYQFLYKV ILSLVSTRQE ENPSTSLDSN 1320 
GAALPDGNIA ESLESLV 

Seq ID NO: 580 DNA sequence 

Nucleic Acid Accession #: EOS sequence

Coding sequence: 148-4632 

1 11 21 31 41 51 
CACACATACG CACGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 60
CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA 120 
CGGCGAGGGG CCGCAGACCG TCTGGAAATG CGAATCCTAA AACGTTTCCT CGCTTGCATT 180
CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 
CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGGAAAG 300 
AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 
CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA 420 
AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACCGT 480 
GTCAGCGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540 
AAATGCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 
GAGATGCAAA TCTACTGCTT TGATGCGGAC CGATTTTCAA GTTTTGAGGA AGCAGTCAAA 660
GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTG 720 
GATTTCAAAG CGATTATTGA TGGAGTCGAA AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 780 
TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 
AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAGAT 900 
ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960 

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080 

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140 

TGGGAAAGAC CTCGAGTCGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGTACCAG 1200 

CAGTTGGATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTCGACAT GCCTACTGAT 1380

AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAAGAGGGAA AAGACATTGA AGAAGGCGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500 

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TCGCATAGGG 1560 

ACGAAATACA ATGAAGCCAA GACTAACCGA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 

GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

AACTTGTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920 

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

GAAAACCCAG AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100 

GTGTGGTTTC CTAGCTCTAC AGACATAACA GCACAGCCCG ATGTTGGATC AGGCAGAGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGAGATA CGTGTTGATG AATCTGAGAA GACAACCAAG 2220 

TCCTTTTCTG CAGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 

TCCAGACAAC AGGATTTGGT CTCCACGGTC AACGTGGTAT ACTCGCAGAC AACCCAACCG 2400 

GTATACAATG AGGCCAGTAA TAGTAGCCAT GAGTCTCGTA TTGGTCTAGC TGAGGGGTTG 2460 

GAATCCGAGA AGAAGGCAGT TATACCCCTT GTGATCGTGT CAGCCCTGAC TTTTATCTGT 2520 

CTAGTGGTTC TTGTGGGTAT TCTCATCTAC TGGAGGAAAT GCTTCCAGAC TGCACACTTT 2580 

TACTTAGAGG ACAGTACATC CCCTAGAGTT ATATCCACAC CTCCAACACC TATCTTTCCA 2640 

ATTTCAGATG ATGTCGGAGC AATTCCAATA AAGCACTTTC CAAAGCATGT TGCAGATTTA 2700 

CATGCAAGTA GTGGGTTTAC TGAAGAATTT GAGACACTGA AAGAGTTTTA CCAGGAAGTG 2760

CAGAGCTGTA CTGTTGACTT AGGTATTACA GCAGACAGCT CCAACCACCC AGACAACAAG 2820 

CACAAGAATC GATACATAAA TATCGTTGCC TATGATCATA GCAGGGTTAA GCTAGCACAG 2880 

CTTGCTGAAA AGGATGGCAA ACTGACTGAT TATATCAATG CCAATTATGT TGATGGCTAC 2940 

AACAGACCAA AAGCTTATAT TGCTGCCCAA GGCCCACTGA AATCCACAGC TGAAGATTTC 3000 

TGGAGAATGA TATGGGAACA TAATGTGGAA GTTATTGTCA TGATAACAAA CCTCGTGGAG 3060 

AAAGGAAGGA GAAAATGTGA TCAGTACTGG CCTGCCGATG GGAGTGAGGA GTACGGGAAC 3120 

TTTCTGGTCA CTCAGAAGAG TGTGCAAGTG CTTGCCTATT ATACTGTGAG GAATTTTACT 3180

CTAAGAAACA CAAAAATAAA AAAGGGCTCC CAGAAAGGAA GACCCAGTGG ACGTGTGGTC 3240 

ACACAGTATC ACTACACGCA GTGGCCTGAC ATGGGAGTAC CAGAGTACTC CCTGCCAGTG 3300 

CTGACCTTTG TGAGAAAGGC AGCCTATGCC AAGCGCCATG CAGTGGGGCC TGTTGTCGTC 3360 

CACTGCAGTG CTGGAGTTGG AAGAACAGGC ACATATATTG TGCTAGACAG TATGTTGCAG 3420 

CAGATTCAAC ACGAAGGAAC TGTCAACATA TTTGGCTTCT TAAAACACAT CCGTTCACAA 3480 

AGAAATTATT TGGTACAAAC TGAGGAGCAA TATGTCTTCA TTCATGATAC ACTGGTTGAG 3540 

GCCATACTTA GTAAAGAAAC TGAGGTGCTG GACAGTCATA TTCATGCCTA TGTTAATGCA 3600

CTCCTCATTC CTGGACCAGC AGGCAAAACA AAGCTAGAGA AACAATTCCA GGGTCTCACT 3660 

CTGTCACCCA GGCTGGAGTG CAGAGGCACA ATCTCGGCTC ACTGCAACCT TCCTCTCCCT 3720 

GGCTTAACTG ATCCTCCTAC CTCAGCCTCC CGAGTGGCTG GGACTATACT CCTGAGCCAG 3780 

TCAAATATAC AGCAGAGTGA CTATTCTGCA GCCCTAAAGC AATGCAACAG GGAAAAGAAT 3840 

CGAACTTCTT CTATCATCCC TGTGGAAAGA TCAAGGGTTG GCATTTCATC CCTGAGTGGA 3900 

GAAGGCACAG ACTACATCAA TGCCTCCTAT ATCATGGGCT ATTACCAGAG CAATGAATTC 3960 

ATCATTACCC AGCACCCTCT CCTTCATACC ATCAAGGATT TCTGGAGGAT GATATGGGAC 4020 

CATAATGCCC AACTGGTGGT TATGATTCCT GATGGCCAAA ACATGGCAGA AGATGAATTT 4080 

GTTTACTGGC CAAATAAAGA TGAGCCTATA AATTGTGAGA GCTTTAAGGT CACTCTTATG 4140 

GCTGAAGAAC ACAAATGTCT ATCTAATGAG GAAAAACTTA TAATTCAGGA CTTTATCTTA 4200 

GAAGCTACAC AGGATGATTA TGTACTTGAA GTGAGGCACT TTCAGTGTCC TAAATGGCCA 4260 

AATCCAGATA GCCCCATTAG TAAAACTTTT GAACTTATAA GTGTTATAAA AGAAGAAGCT 4320 

GCCAATAGGG ATGGGCCTAT GATTGTTCAT GATGAGCATG GAGGAGTGAC GGCAGGAACT 4380 

TTCTGTGCTC TGACAACCCT TATGCACCAA CTAGAAAAAG AAAATTCCGT GGATGTTTAC 4440 

CAGGTAGCCA AGATGATCAA TCTGATGAGG CCAGGAGTCT TTGCTGACAT TGAGCAGTAT 4500 

CAGTTTCTCT ACAAAGTGAT CCTCAGCCTT GTGGGCACAA GGCAGGAAGA GAATCCATCC 4560 

ACCTCTCTGG ACAGTAATGG TGCAGCATTG CCTGATGGAA ATATAGCTGA GAGCTTAGAG 4620 

TCTTTAGTTT AACACAGAAA GGGGTGGGGG GACTCACATC TGAGCATTGT TTTCCTCTTC 4680 

CTAAAATTAG GCAGGAAAAT CAGTCTAGTT CTGTTATCTG TTGATTTCCC ATCACCTGAC 4740 

AGTAACTTTC ATGACATAGG ATTCTGCCGC CAAATTTATA TCATTAACAA TGTGTGCCTT 4800 

TTTGCAAGAC TTGTAATTTA CTTATTATGT TTGAACTAAA ATGATTGAAT TTTACAGTAT 4860 

TTCTAAGAAT GGAATTGTGG TATTTTTTTC TGTATTGATT TTAACAGAAA ATTTCAATTT 4920 

ATAGAGGTTA GGAATTCCAA ACTACAGAAA ATGTTTGTTT TTAGTGTCAA ATTTTTAGCT 4980 

GTATTTGTAG CAATTATCAG GTTTGCTAGA AATATAACTT TTAATACAGT AGCCTGTAAA 5040 

TAAAACACTC TTCCATATGA TATTCAACAT TTTACAACTG CAGTATTCAC CTAAAGTAGA 5100 

AATAATCTGT TACTTATTGT AAATACTGCC CTAGTGTCTC CATGGACCAA ATTTATATTT 5160 
ATAATTGTAG ATTTTTATAT TTTACTACTG AGTCAAGTTT TCTAGTTCTG TGTAATTGTT 5220

TAGTTTAATG ACGTAGTTCA TTAGCTGGTC TTACTCTACC AGTTTTCTGA CATTGTATTG 5280

TGTTACCTAA GTCATTAACT TTGTTTCAGC ATGTAATTTT AACTTTTGTG GAAAATAGAA 5340 

ATACCTTCAT TTTGAAAGAA GTTTTTATGA GAATAACACC TTACCAAACA TTGTTCAAAT 5400 

GGTTTTTATC CAAGGAATTG CAAAAATAAA TATAAATATT GCCATTAAAA AAAAAAAAAA 5460
AAAAAAAAAA AAAAAAAAAA A 

Seq ID NO: 581 Protein sequence: 
Protein Accession #: EOS sequence 

1 11 21 31 41 51 

I I I I I I 

MRILKRFLAC IQLLCVCRLD WANGYYRQQR KLVEEIGWSY TGALNQKNWG KKYPTCNSPK 60 

QSPINIDEDL TQVNVNLKKL KFQGWDKTSL ENTFIHNTGK TVEINLTNDY RVSGGVSEMV 120 

FKASKITFHW GKCNMSSDGS EHSLEGQKFP LEMQIYCFDA DRFSSFEEAV KGKGKLRALS 180 

ILFEVGTEEN LDFKAIIDGV ESVSRFGKQA ALDPFILLNL LPNSTDKYYI YNGSLTSPPC 240 

TDTVDWIVFK DTVSISESQL AVFCEVLTMQ QSGYVMLMDY LQNNFREQQY KFSRQVFSSY 300 

TGKEEIHEAV CSSEPENVQA DPENYTSLLV TWERPRVVYD TMIEKFAVLY QQLDGEDQTK 360

HEFLTDGYQD LGAILNNLLP NMSYVLQIVA ICTNGLYGKY SDQLIVDMPT DNPELDLFPE 420 

LIGTEEIIKE EEEGKDIEEG AIVNPGRDSA TNQIRKKEPQ ISTTTHYNRI GTKYNEAKTN 480

RSPTRGSEFS GKGDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTELPPHT VEGTSASLND 540 

GSKTVLRSPH MNLSGTAESL NTVSITEYEE ESLLTSFKLD TGAEDSSGSS PATSAIPFIS 600

ENISQGYIFS SENPETITYD VLIPESARNA SEDSTSSGSE ESLKDPSMEG NVWFPSSTDI 660 

TAQPDVGSGR ESFLQTNYTE IRVDESEKTT KSFSAGPVMS QGPSVTDLEM PHYSTFAYFP 720 

TEVTPHAFTP SSRQQDLVST VNVVYSQTTQ PVYNEASNSS HESRIGLAEG LESEKKAVIP 780

LVIVSALTFI CLVVLVGILI YWRKCFQTAH FYLEDSTSPR VISTPPTPIF PISDDVGAIP 840

IKHFPKHVAD LHASSGFTEE FETLKEFYQE VQSCTVDLGI TADSSNHPDN KHKNRYINIV 900

AYDHSRVKLA QLAEKDGKLT DYINANYVDG YNRPKAYIAA QGPLKSTAED FWRMIWEHNV 960 

EVIVMITNLV EKGRRKCDQY WPADGSEEYG NFLVTQKSVQ VLAYYTVRNF TLRNTKIKKG 1020

SQKGRPSGRV VTQYHYTQWP DMGVPEYSLP VLTFVRKAAY AKRHAVGPVV VHCSAGVGRT 1080

GTYIVLDSML QQIQHEGTVN IFGFLKHIRS QRNYLVQTEE QYVFIHDTLV EAILSKETEV 1140 

LDSHIHAYVN ALLIPGPAGK TKLEKQFQGL TLSPRLECRG TISAHCNLPL PGLTDPPTSA 1200

SRVAGTILLS QSNIQQSDYS AALKQCNREK NRTSSIIPVE RSRVGISSLS GEGTDYINAS 1260 

YIMGYYQSNE FIITQHPLLH TIKDFWRMIW DHNAQLVVMI PDGQNMAEDE FVYWPNKDEP 1320

INCESFKVTL MAEEHKCLSN EEKLIIQDFI LEATQDDYVL EVRHFQCPKW PNPDSPISKT 1380

FELISVIKEE AANRDGPMIV HDEHGGVTAG TFCALTTLMH QLEKENSVDV YQVAKMINLM 1440 
RPGVFADIEQ YQFLYKVILS LVGTRQEENP STSLDSNGAA LPDGNIAESL ESLV 



Seq ID NO: 582 DNA sequence 

Nucleic Acid Accession #: NM_002851.1 

Coding sequence: 148..7092
CACACATACG CACGCACGAT CTCACTTCGA TCTATACACT GGAGGATTAA AACAAACAAA 60

CAAAAAAAAC ATTTCCTTCG CTCCCCCTCC CTCTCCACTC TGAGAAGCAG AGGAGCCGCA 120 

CGGCGAGGGG CCGCAGACCG TCTGGAAATG CGAATCCTAA AGCGTTTCCT CGCTTGCATT 180

CAGCTCCTCT GTGTTTGCCG CCTGGATTGG GCTAATGGAT ACTACAGACA ACAGAGAAAA 240 

CTTGTTGAAG AGATTGGCTG GTCCTATACA GGAGCACTGA ATCAAAAAAA TTGGGGAAAG 300 

AAATATCCAA CATGTAATAG CCCAAAACAA TCTCCTATCA ATATTGATGA AGATCTTACA 360 

CAAGTAAATG TGAATCTTAA GAAACTTAAA TTTCAGGGTT GGGATAAAAC ATCATTGGAA 420 

AACACATTCA TTCATAACAC TGGGAAAACA GTGGAAATTA ATCTCACTAA TGACTACCGT 480 

GTCAGCGGAG GAGTTTCAGA AATGGTGTTT AAAGCAAGCA AGATAACTTT TCACTGGGGA 540 

AAATGCAATA TGTCATCTGA TGGATCAGAG CATAGTTTAG AAGGACAAAA ATTTCCACTT 600 

GAGATGCAAA TCTACTGCTT TGATGCGGAC CGATTTTCAA GTTTTGAGGA AGCAGTCAAA 660 

GGAAAAGGGA AGTTAAGAGC TTTATCCATT TTGTTTGAGG TTGGGACAGA AGAAAATTTG 720 

GATTTCAAAG CGATTATTGA TGGAGTCGAA AGTGTTAGTC GTTTTGGGAA GCAGGCTGCT 780 

TTAGATCCAT TCATACTGTT GAACCTTCTG CCAAACTCAA CTGACAAGTA TTACATTTAC 840 

AATGGCTCAT TGACATCTCC TCCCTGCACA GACACAGTTG ACTGGATTGT TTTTAAAGAT 900 

ACAGTTAGCA TCTCTGAAAG CCAGTTGGCT GTTTTTTGTG AAGTTCTTAC AATGCAACAA 960 

TCTGGTTATG TCATGCTGAT GGACTACTTA CAAAACAATT TTCGAGAGCA ACAGTACAAG 1020 

TTCTCTAGAC AGGTGTTTTC CTCATACACT GGAAAGGAAG AGATTCATGA AGCAGTTTGT 1080 

AGTTCAGAAC CAGAAAATGT TCAGGCTGAC CCAGAGAATT ATACCAGCCT TCTTGTTACA 1140 

TGGGAAAGAC CTCGAGTCGT TTATGATACC ATGATTGAGA AGTTTGCAGT TTTGTACCAG 1200 

CAGTTGGATG GAGAGGACCA AACCAAGCAT GAATTTTTGA CAGATGGCTA TCAAGACTTG 1260 

GGTGCTATTC TCAATAATTT GCTACCCAAT ATGAGTTATG TTCTTCAGAT AGTAGCCATA 1320 

TGCACTAATG GCTTATATGG AAAATACAGC GACCAACTGA TTGTCGACAT GCCTACTGAT 1380 

AATCCTGAAC TTGATCTTTT CCCTGAATTA ATTGGAACTG AAGAAATAAT CAAGGAGGAG 1440 

GAAGAGGGAA AAGACATTGA AGAAGGCGCT ATTGTGAATC CTGGTAGAGA CAGTGCTACA 1500 

AACCAAATCA GGAAAAAGGA ACCCCAGATT TCTACCACAA CACACTACAA TCGCATAGGG 1560 

ACGAAATACA ATGAAGCCAA GACTAACCGA TCCCCAACAA GAGGAAGTGA ATTCTCTGGA 1620 

AAGGGTGATG TTCCCAATAC ATCTTTAAAT TCCACTTCCC AACCAGTCAC TAAATTAGCC 1680 

ACAGAAAAAG ATATTTCCTT GACTTCTCAG ACTGTGACTG AACTGCCACC TCACACTGTG 1740 

GAAGGTACTT CAGCCTCTTT AAATGATGGC TCTAAAACTG TTCTTAGATC TCCACATATG 1800 

AACTTGTCGG GGACTGCAGA ATCCTTAAAT ACAGTTTCTA TAACAGAATA TGAGGAGGAG 1860 

AGTTTATTGA CCAGTTTCAA GCTTGATACT GGAGCTGAAG ATTCTTCAGG CTCCAGTCCC 1920

GCAACTTCTG CTATCCCATT CATCTCTGAG AACATATCCC AAGGGTATAT ATTTTCCTCC 1980 

GAAAACCCAG AGACAATAAC ATATGATGTC CTTATACCAG AATCTGCTAG AAATGCTTCC 2040 

GAAGATTCAA CTTCATCAGG TTCAGAAGAA TCACTAAAGG ATCCTTCTAT GGAGGGAAAT 2100 

GTGTGGTTTC CTAGCTCTAC AGACATAACA GCACAGCCCG ATGTTGGATC AGGCAGAGAG 2160 

AGCTTTCTCC AGACTAATTA CACTGAGATA CGTGTTGATG AATCTGAGAA GACAACCAAG 2220 

TCCTTTTCTG CAGGCCCAGT GATGTCACAG GGTCCCTCAG TTACAGATCT GGAAATGCCA 2280 

CATTATTCTA CCTTTGCCTA CTTCCCAACT GAGGTAACAC CTCATGCTTT TACCCCATCC 2340 

TCCAGACAAC AGGATTTGGT CTCCACGGTC AACGTGGTAT ACTCGCAGAC AACCCAACCG 2400 

GTATACAATG GTGAGACACC TCTTCAACCT TCCTACAGTA GTGAAGTCTT TCCTCTAGTC 2460 

ACCCCTTTGT TGCTTGACAA TCAGATCCTC AACACTACCC CTGCTGCTTC AAGTAGTGAT 2520 
TCGGCCTTGC ATGCTACGCC TGTATTTCCC AGTGTCGATG TGTCATTTGA ATCCATCCTG 2580

TCTTCCTATG ATGGTGCACC TTTGCTTCCA TTTTCCTCTG CTTCCTTCAG TAGTGAATTG 2640 

TTTCGCCATC TGCATACAGT TTCTCAAATC CTTCCACAAG TTACTTCAGC TACCGAGAGT 2700 

GATAAGGTGC CCTTGCATGC TTCTCTGCCA GTGGCTGGGG GTGATTTGCT ATTAGAGCCC 2760 

AGCCTTGCTC AGTATTCTGA TGTGCTGTCC ACTACTCATG CTGCTTCAGA GACGCTGGAA 2820 

TTTGGTAGTG AATCTGGTGT TCTTTATAAA ACGCTTATGT TTTCTCAAGT TGAACCACCC 2880

AGCAGTGATG CCATGATGCA TGCACGTTCT TCAGGGCCTG AACCTTCTTA TGCCTTGTCT 2940 

GATAATGAGG GCTCCCAACA CATCTTCACT GTTTCTTACA GTTCTGCAAT ACCTGTGCAT 3000 

GATTCTGTGG GTGTAACTTA TCAGGGTTCC TTATTTAGCG GCCCTAGCCA TATACCAATA 3060 

CCTAAGTCTT CGTTAATAAC CCCAACTGCA TCATTACTGC AGCCTACTCA TGCCCTCTCT 3120

GGTGATGGGG AATGGTCTGG AGCCTCTTCT GATAGTGAAT TTCTTTTACC TGACACAGAT 3180

GGGCTGACAG CCCTTAACAT TTCTTCACCT GTTTCTGTAG CTGAATTTAC ATATACAACA 3240 

TCTGTGTTTG GTGATGATAA TAAGGCGCTT TCTAAAAGTG AAATAATATA TGGAAATGAG 3300 

ACTGAACTGC AAATTCCTTC TTTCAATGAG ATGGTTTACC CTTCTGAAAG CACAGTCATG 3360 

CCCAACATGT ATGATAATGT AAATAAGTTG AATGCGTCTT TACAAGAAAC CTCTGTTTCC 3420

ATTTCTAGCA CCAAGGGCAT GTTTCCAGGG TCCCTTGCTC ATACCACCAC TAAGGTTTTT 3480 

GATCATGAGA TTAGTCAAGT TCCAGAAAAT AACTTTTCAG TTCAACCTAC ACATACTGTC 3540 

TCTCAAGCAT CTGGTGACAC TTCGCTTAAA CCTGTGCTTA GTGCAAACTC AGAGCCAGCA 3600

TCCTCTGACC CTGCTTCTAG TGAAATGTTA TCTCCTTCAA CTCAGCTCTT ATTTTATGAG 3660

ACCTCAGCTT CTTTTAGTAC TGAAGTATTG CTACAACCTT CCTTTCAGGC TTCTGATGTT 3720 

GACACCTTGC TTAAAACTGT TCTTCCAGCT GTGCCCAGTG ATCCAATATT GGTTGAAACC 3780 

CCCAAAGTTG ATAAAATTAG TTCTACAATG TTGCATCTCA TTGTATCAAA TTCTGCTTCA 3840 

AGTGAAAACA TGCTGCACTC TACATCTGTA CCAGTTTTTG ATGTGTCGCC TACTTCTCAT 3900 

ATGCACTCTG CTTCACTTCA AGGTTTGACC ATTTCCTATG CAAGTGAGAA ATATGAACCA 3960 

GTTTTGTTAA AAAGTGAAAG TTCCCACCAA GTGGTACCTT CTTTGTACAG TAATGATGAG 4020 

TTGTTCCAAA CGGCCAATTT GGAGATTAAC CAGGCCCATC CCCCAAAAGG AAGGCATGTA 4080 

TTTGCTACAC CTGTTTTATC AATTGATGAA CCATTAAATA CACTAATAAA TAAGCTTATA 4140 

CATTCCGATG AAATTTTAAC CTCCACCAAA AGTTCTGTTA CTGGTAAGGT ATTTGCTGGT 4200 

ATTCCAACAG TTGCTTCTGA TACATTTGTA TCTACTGATC ATTCTGTTCC TATAGGAAAT 4260 

GGGCATGTTG CCATTACAGC TGTTTCTCCC CACAGAGATG GTTCTGTAAC CTCAACAAAG 4320 

TTGCTGTTTC CTTCTAAGGC AACTTCTGAG CTGAGTCATA GTGCCAAATC TGATGCCGGT 4380 

TTAGTGGGTG GTGGTGAAGA TGGTGACACT GATGATGATG GTGATGATGA TGATGACAGA 4440 

GATAGTGATG GCTTATCCAT TCATAAGTGT ATGTCATGCT CATCCTATAG AGAATCACAG 4500 

GAAAAGGTAA TGAATGATTC AGACACCCAC GAAAACAGTC TTATGGATCA GAATAATCCA 4560 

ATCTCATACT CACTATCTGA GAATTCTGAA GAAGATAATA GAGTCACAAG TGTATCCTCA 4620 

GACAGTCAAA CTGGTATGGA CAGAAGTCCT GGTAAATCAC CATCAGCAAA TGGGCTATCC 4680 

CAAAAGCACA ATGATGGAAA AGAGGAAAAT GACATTCAGA CTGGTAGTGC TCTGCTTCCT 4740 

CTCAGCCCTG AATCTAAAGC ATGGGCAGTT CTGACAAGTG ATGAAGAAAG TGGATCAGGG 4800 

CAAGGTACCT CAGATAGCCT TAATGAGAAT GAGACTTCCA CAGATTTCAG TTTTGCAGAC 4860 

ACTAATGAAA AAGATGCTGA TGGGATCCTG GCAGCAGGTG ACTCAGAAAT AACTCCTGGA 4920 

TTCCCACAGT CCCCAACATC ATCTGTTACT AGCGAGAACT CAGAAGTGTT CCACGTTTCA 4980 

GAGGCAGAGG CCAGTAATAG TAGCCATGAG TCTCGTATTG GTCTAGCTGA GGGGTTGGAA 5040 

TCCGAGAAGA AGGCAGTTAT ACCCCTTGTG ATCGTGTCAG CCCTGACTTT TATCTGTCTA 5100 

GTGGTTCTTG TGGGTATTCT CATCTACTGG AGGAAATGCT TCCAGACTGC ACACTTTTAC 5160 

TTAGAGGACA GTACATCCCC TAGAGTTATA TCCACACCTC CAACACCTAT CTTTCCAATT 5220 

TCAGATGATG TCGGAGCAAT TCCAATAAAG CACTTTCCAA AGCATGTTGC AGATTTACAT 5280 

GCAAGTAGTG GGTTTACTGA AGAATTTGAG ACACTGAAAG AGTTTTACCA GGAAGTGCAG 5340 

AGCTGTACTG TTGACTTAGG TATTACAGCA GACAGCTCCA ACCACCCAGA CAACAAGCAC 5400 

AAGAATCGAT ACATAAATAT CGTTGCCTAT GATCATAGCA GGGTTAAGCT AGCACAGCTT 5460 

GCTGAAAAGG ATGGCAAACT GACTGATTAT ATCAATGCCA ATTATGTTGA TGGCTACAAC 5520 

AGACCAAAAG CTTATATTGC TGCCCAAGGC CCACTGAAAT CCACAGCTGA AGATTTCTGG 5580 

AGAATGATAT GGGAACATAA TGTGGAAGTT ATTGTCATGA TAACAAACCT CGTGGAGAAA 5640 

GGAAGGAGAA AATGTGATCA GTACTGGCCT GCCGATGGGA GTGAGGAGTA CGGGAACTTT 5700 

CTGGTCACTC AGAAGAGTGT GCAAGTGCTT GCCTATTATA CTGTGAGGAA TTTTACTCTA 5760 

AGAAACACAA AAATAAAAAA GGGCTCCCAG AAAGGAAGAC CCAGTGGACG TGTGGTCACA 5820 

CAGTATCACT ACACGCAGTG GCCTGACATG GGAGTACCAG AGTACTCCCT GCCAGTGCTG 5880

ACCTTTGTGA GAAAGGCAGC CTATGCCAAG CGCCATGCAG TGGGGCCTGT TGTCGTCCAC 5940 

TGCAGTGCTG GAGTTGGAAG AACAGGCACA TATATTGTGC TAGACAGTAT GTTGCAGCAG 6000 

ATTCAACACG AAGGAACTGT CAACATATTT GGCTTCTTAA AACACATCCG TTCACAAAGA 6060 

AATTATTTGG TACAAACTGA GGAGCAATAT GTCTTCATTC ATGATACACT GGTTGAGGCC 6120 

ATACTTAGTA AAGAAACTGA GGTGCTGGAC AGTCATATTC ATGCCTATGT TAATGCACTC 6180 

CTCATTCCTG GACCAGCAGG CAAAACAAAG CTAGAGAAAC AATTCCAGCT CCTGAGCCAG 6240 

TCAAATATAC AGCAGAGTGA CTATTCTGCA GCCCTAAAGC AATGCAACAG GGAAAAGAAT 6300 

CGAACTTCTT CTATCATCCC TGTGGAAAGA TCAAGGGTTG GCATTTCATC CCTGAGTGGA 6360 

GAAGGCACAG ACTACATCAA TGCCTCCTAT ATCATGGGCT ATTACCAGAG CAATGAATTC 6420 

ATCATTACCC AGCACCCTCT CCTTCATACC ATCAAGGATT TCTGGAGGAT GATATGGGAC 6480 

CATAATGCCC AACTGGTGGT TATGATTCCT GATGGCCAAA ACATGGCAGA AGATGAATTT 6540 

GTTTACTGGC CAAATAAAGA TGAGCCTATA AATTGTGAGA GCTTTAAGGT CACTCTTATG 6600

GCTGAAGAAC ACAAATGTCT ATCTAATGAG GAAAAACTTA TAATTCAGGA CTTTATCTTA 6660 

GAAGCTACAC AGGATGATTA TGTACTTGAA GTGAGGCACT TTCAGTGTCC TAAATGGCCA 6720 

AATCCAGATA GCCCCATTAG TAAAACTTTT GAACTTATAA GTGTTATAAA AGAAGAAGCT 6780 

GCCAATAGGG ATGGGCCTAT GATTGTTCAT GATGAGCATG GAGGAGTGAC GGCAGGAACT 6840 

TTCTGTGCTC TGACAACCCT TATGCACCAA CTAGAAAAAG AAAATTCCGT GGATGTTTAC 6900 

CAGGTAGCCA AGATGATCAA TCTGATGAGG CCAGGAGTCT TTGCTGACAT TGAGCAGTAT 6960 

CAGTTTCTCT ACAAAGTGAT CCTCAGCCTT GTGAGCACAA GGCAGGAAGA GAATCCATCC 7020 

ACCTCTCTGG ACAGTAATGG TGCAGCATTG CCTGATGGAA ATATAGCTGA GAGCTTAGAG 7080 

TCTTTAGTTT AACACAGAAA GGGGTGGGGG GACTCACATC TGAGCATTGT TTTCCTCTTC 7140 

CTAAAATTAG GCAGGAAAAT CAGTCTAGTT CTGTTATCTG TTGATTTCCC ATCACCTGAC 7200 

AGTAACTTTC ATGACATAGG ATTCTGCCGC CAAATTTATA TCATTAACAA TGTGTGCCTT 7260 

TTTGCAAGAC TTGTAATTTA CTTATTATGT TTGAACTAAA ATGATTGAAT TTTACAGTAT 7320 

TTCTAAGAAT GGAATTGTGG TATTTTTTTC TGTATTGATT TTAACAGAAA ATTTCAATTT 7380 

ATAGAGGTTA GGAATTCCAA ACTACAGAAA ATGTTTGTTT TTAGTGTCAA ATTTTTAGCT 7440 

GTATTTGTAG CAATTATCAG GTTTGCTAGA AATATAACTT TTAATACAGT AGCCTGTAAA 7500 

TAAAACACTC TTCCATATGA TATTCAACAT TTTACAACTG CAGTATTCAC CTAAAGTAGA 7560 

AATAATCTGT TACTTATTGT AAATACTGCC CTAGTGTCTC CATGGACCAA ATTTATATTT 7620 

ATAATTGTAG ATTTTTATAT TTTACTACTG AGTCAAGTTT TCTAGTTCTG TGTAATTGTT 7680 

TAGTTTAATG ACGTAGTTCA TTAGCTGGTC TTACTCTACC AGTTTTCTGA CATTGTATTG 7740 
TGTTACCTAA GTCATTAACT TTGTTTCAGC ATGTAATTTT AACTTTTGTG GAAAATAGAA 7800

ATACCTTCAT TTTGAAAGAA GTTTTTATGA GAATAACACC TTACCAAACA TTGTTCAAAT 7860 

GGTTTTTATC CAAGGAATTG CAAAAATAAA TATAAATATT GCCATTAAAA AAAAAAAAAA 7920 
AAAAAAAAAA AAAAAAAAAA A 

Seq ID NO: 583 Protein sequence
Protein Accession #: NP_002842.1

1 11 21 31 41 51

| | | | | |

MRILKRFLAC IQLLCVCRLD WANGYYRQQR KLVEEIGWSY TGALNQKNWG KKYPTCNSPK 60 

QSPINIDEDL TQVNVNLKKL KFQGWDKTSL ENTFIHNTGK TVEINLTNDY RVSGGVSEMV 120 

FKASKITFHW GKCNMSSDGS EHSLEGQKFP LEMQIYCFDA DRFSSFEEAV KGKGKLRALS 180 

ILFEVGTEEN LDFKAIIDGV ESVSRFGKQA ALDPFILLNL LPNSTDKYYI YNGSLTSPPC 240

TDTVDWIVFK DTVSISESQL AVFCEVLTMQ QSGYVMLMDY LQNNFREQQY KFSRQVFSSY 300

TGKEEIHEAV CSSEPENVQA DPENYTSLLV TWERPRVVYD TMIEKFAVLY QQLDGEDQTK 360

HEFLTDGYQD LGAILNNLLP NMSYVLQIVA ICTNGLYGKY SDQLIVDMPT DNPELDLFPE 420 

LIGTEEIIKE EEEGKDIEEG AIVNPGRDSA TNQIRKKEPQ ISTTTHYNRI GTKYNEAKTN 480 

RSPTRGSEFS GKGDVPNTSL NSTSQPVTKL ATEKDISLTS QTVTELPPHT VEGTSASLND 540 

GSKTVLRSPH MNLSGTAESL NTVSITEYEE ESLLTSFKLD TGAEDSSGSS PATSAIPFIS 600 

ENISQGYIFS SENPETITYD VLIPESARNA SEDSTSSGSE ESLKDPSMEG NVWFPSSTDI 660

TAQPDVGSGR ESFLQTNYTE IRVDESEKTT KSFSAGPVMS QGPSVTDLEM PHYSTFAYFP 720

TEVTPHAFTP SSRQQDLVST VNVVYSQTTQ PVYNGETPLQ PSYSSEVFPL VTPLLLDNQI 780

LNTTPAASSS DSALHATPVF PSVDVSFESI LSSYDGAPLL PFSSASFSSE LFRHLHTVSQ 840

ILPQVTSATE SDKVPLHASL PVAGGDLLLE PSLAQYSDVL STTHAASETL EFGSESGVLY 900 

KTLMFSQVEP PSSDAMMHAR SSGPEPSYAL SDNEGSQHIF TVSYSSAIPV HDSVGVTYQG 960 

SLFSGPSHIP IPKSSLITPT ASLLQPTHAL SGDGEWSGAS SDSEFLLPDT DGLTALNISS 1020 

PVSVAEFTYT TSVFGDDNKA LSKSEIIYGN ETELQIPSFN EMVYPSESTV MPNMYDNVNK 1080 

LNASLQETSV SISSTKGMFP GSLAHTTTKV FDHEISQVPE NNFSVQPTHT VSQASGDTSL 1140 

KPVLSANSEP ASSDPASSEM LSPSTQLLFY ETSASFSTEV LLQPSFQASD VDTLLKTVLP 1200 

AVPSDPILVE TPKVDKISST MLHLIVSNSA SSENMLHSTS VPVFDVSPTS HMHSASLQGL 1260

TISYASEKYE PVLLKSESSH QVVPSLYSND ELFQTANLEI NQAHPPKGRH VFATPVLSID 1320

EPLNTLINKL IHSDEILTST KSSVTGKVFA GIPTVASDTF VSTDHSVPIG NGHVAITAVS 1380 

PHRDGSVTST KLLFPSKATS ELSHSAKSDA GLVGGGEDGD TDDDGDDDDD RDSDGLSIHK 1440 

CMSCSSYRES QEKVMNDSDT HENSLMDQNN PISYSLSENS EEDNRVTSVS SDSQTGMDRS 1500 

PGKSPSANGL SQKHNDGKEE NDIQTGSALL PLSPESKAWA VLTSDEESGS GQGTSDSLNE 1560

NETSTDFSFA DTNEKDADGI LAAGDSEITP GFPQSPTSSV TSENSEVFHV SEAEASNSSH 1620

ESRIGLAEGL ESEKKAVIPL VIVSALTFIC LVVLVGILIY WRKCFQTAHF YLEDSTSPRV 1680

ISTPPTPIFP ISDDVGAIPI KHFPKHVADL HASSGFTEEF ETLKEFYQEV QSCTVDLGIT 1740 

ADSSNHPDNK HKNRYINIVA YDHSRVKLAQ LAEKDGKLTD YINANYVDGY NRPKAYIAAQ 1800 

GPLKSTAEDF WRMIWEHNVE VIVMITNLVE KGRRKCDQYW PADGSEEYGN FLVTQKSVQV 1860 

LAYYTVRNFT LRNTKIKKGS QKGRPSGRVV TQYHYTQWPD MGVPEYSLPV LTFVRKAAYA 1920

KRHAVGPVVV HCSAGVGRTG TYIVLDSMLQ QIQHEGTVNI FGFLKHIRSQ RNYLVQTEEQ 1980

YVFIHDTLVE AILSKETEVL DSHIHAYVNA LLIPGPAGKT KLEKQFQLLS QSNIQQSDYS 2040

AALKQCNREK NRTSSIIPVE RSRVGISSLS GEGTDYINAS YIMGYYQSNE FIITQHPLLH 2100

TIKDFWRMIW DHNAQLVVMI PDGQNMAEDE FVYWPNKDEP INCESFKVTL MAEEHKCLSN 2160

EEKLIIQDFI LEATQDDYVL EVRHFQCPKW PNPDSPISKT FELISVIKEE AANRDGPMIV 2220

HDEHGGVTAG TFCALTTLMH QLEKENSVDV YQVAKMINLM RPGVFADIEQ YQFLYKVILS 2280 
LVSTRQEENP STSLDSNGAA LPDGNIAESL ESLV

Seq ID NO: 584 DNA sequence 

Nucleic Acid Accession #: NM_005688.1

Coding sequence: 126..4439

1 11 21 31 41 51

| | | | | |

CCGGGCAGGT GGCTCATGCT CGGGAGCGTG GTTGAGCGGC TGGCGCGGTT GTCCTGGAGC 60 

AGGGGCGCAG GAATTCTGAT GTGAAACTAA CAGTCTGTGA GCCCTGGAAC CTCCGCTCAG 120 

AGAAGATGAA GGATATCGAC ATAGGAAAAG AGTATATCAT CCCCAGTCCT GGGTATAGAA 180 

GTGTGAGGGA GAGAACCAGC ACTTCTGGGA CGCACAGAGA CCGTGAAGAT TCCAAGTTCA 240 

GGAGAACTCG ACCGTTGGAA TGCCAAGATG CCTTGGAAAC AGCAGCCCGA GCCGAGGGCC 300 

TCTCTCTTGA TGCCTCCATG CATTCTCAGC TCAGAATCCT GGATGAGGAG CATCCCAAGG 360 

GAAAGTACCA TCATGGCTTG AGTGCTCTGA AGCCCATCCG GACTACTTCC AAACACCAGC 420 

ACCCAGTGGA CAATGCTGGG CTTTTTTCCT GTATGACTTT TTCGTGGCTT TCTTCTCTGG 480 

CCCGTGTGGC CCACAAGAAG GGGGAGCTCT CAATGGAAGA CGTGTGGTCT CTGTCCAAGC 540 

ACGAGTCTTC TGACGTGAAC TGCAGAAGAC TAGAGAGACT GTGGCAAGAA GAGCTGAATG 600 

AAGTTGGGCC AGACGCTGCT TCCCTGCGAA GGGTTGTGTG GATCTTCTGC CGCACCAGGC 660 

TCATCCTGTC CATCGTGTGC CTGATGATCA CGCAGCTGGC TGGCTTCAGT GGACCAGCCT 720 

TCATGGTGAA ACACCTCTTG GAGTATACCC AGGCAACAGA GTCTAACCTG CAGTACAGCT 780 

TGTTGTTAGT GCTGGGCCTC CTCCTGACGG AAATCGTGCG GTCTTGGTCG CTTGCACTGA 840 

CTTGGGCATT GAATTACCGA ACCGGTGTCC GCTTGCGGGG GGCCATCCTA ACCATGGCAT 900 

TTAAGAAGAT CCTTAAGTTA AAGAACATTA AAGAGAAATC CCTGGGTGAG CTCATCAACA 960 

TTTGCTCCAA CGATGGGCAG AGAATGTTTG AGGCAGCAGC CGTTGGCAGC CTGCTGGCTG 1020

GAGGACCCGT TGTTGCCATC TTAGGCATGA TTTATAATGT AATTATTCTG GGACCAACAG 1080 

GCTTCCTGGG ATCAGCTGTT TTTATCCTCT TTTACCCAGC AATGATGTTT GCATCACGGC 1140 

TCACAGCATA TTTCAGGAGA AAATGCGTGG CCGCCACGGA TGAACGTGTC CAGAAGATGA 1200 

ATGAAGTTCT TACTTACATT AAATTTATCA AAATGTATGC CTGGGTCAAA GCATTTTCTC 1260 

AGAGTGTTCA AAAAATCCGC GAGGAGGAGC GTCGGATATT GGAAAAAGCC GGGTACTTCC 1320 

AGGGTATCAC TGTGGGTGTG GCTCCCATTG TGGTGGTGAT TGCCAGCGTG GTGACCTTCT 1380 

CTGTTCATAT GACCCTGGGC TTCGATCTGA CAGCAGCACA GGCTTTCACA GTGGTGACAG 1440 

TCTTCAATTC CATGACTTTT GCTTTGAAAG TAACACCGTT TTCAGTAAAG TCCCTCTCAG 1500 

AAGCCTCAGT GGCTGTTGAC AGATTTAAGA GTTTGTTTCT AATGGAAGAG GTTCACATGA 1560 

TAAAGAACAA ACCAGCCAGT CCTCACATCA AGATAGAGAT GAAAAATGCC ACCTTGGCAT 1620 

GGGACTCCTC CCACTCCAGT ATCCAGAACT CGCCCAAGCT GACCCCCAAA ATGAAAAAAG 1680 

ACAAGAGGGC TTCCAGGGGC AAGAAAGAGA AGGTGAGGCA GCTGCAGCGC ACTGAGCATC 1740 

AGGCGGTGCT GGCAGAGCAG AAAGGCCACC TCCTCCTGGA CAGTGACGAG CGGCCCAGTC 1800 

CCGAAGAGGA AGAAGGCAAG CACATCCACC TGGGCCACCT GCGCTTACAG AGGACACTGC 1860 
ACAGCATCGA TCTGGAGATC CAAGAGGGTA AACTGGTTGG AATCTGCGGC AGTGTGGGAA 1920

GTGGAAAAAC CTCTCTCATT TCAGCCATTT TAGGCCAGAT GACGCTTCTA GAGGGCAGCA 1980 

TTGCAATCAG TGGAACCTTC GCTTATGTGG CCCAGCAGGC CTGGATCCTC AATGCTACTC 2040 

TGAGAGACAA CATCCTGTTT GGGAAGGAAT ATGATGAAGA AAGATACAAC TCTGTGCTGA 2100 

ACAGCTGCTG CCTGAGGCCT GACCTGGCCA TTCTTCCCAG CAGCGACCTG ACGGAGATTG 2160 

GAGAGCGAGG AGCCAACCTG AGCGGTGGGC AGCGCCAGAG GATCAGCCTT GCCCGGGCCT 2220 

TGTATAGTGA CAGGAGCATC TACATCCTGG ACGACCCCCT CAGTGCCTTA GATGCCCATG 2280 

TGGGCAACCA CATCTTCAAT AGTGCTATCC GGAAACATCT CAAGTCCAAG ACAGTTCTGT 2340 

TTGTTACCCA CCAGTTACAG TACCTGGTTG ACTGTGATGA AGTGATCTTC ATGAAAGAGG 2400 

GCTGTATTAC GGAAAGAGGC ACCCATGAGG AACTGATGAA TTTAAATGGT GACTATGCTA 2460 

CCATTTTTAA TAACCTGTTG CTGGGAGAGA CACCGCCAGT TGAGATCAAT TCAAAAAAGG 2520 

AAACCAGTGG TTCACAGAAG AAGTCACAAG ACAAGGGTCC TAAAACAGGA TCAGTAAAGA 2580 

AGGAAAAAGC AGTAAAGCCA GAGGAAGGGC AGCTTGTGCA GCTGGAAGAG AAAGGGCAGG 2640 

GTTCAGTGCC CTGGTCAGTA TATGGTGTCT ACATCCAGGC TGCTGGGGGC CCCTTGGCAT 2700 

TCCTGGTTAT TATGGCCCTT TTCATGCTGA ATGTAGGCAG CACCGCCTTC AGCACCTGGT 2760

GGTTGAGTTA CTGGATCAAG CAAGGAAGCG GGAACACCAC TGTGACTCGA GGGAACGAGA 2820 

CCTCGGTGAG TGACAGCATG AAGGACAATC CTCATATGCA GTACTATGCC AGCATCTACG 2880 

CCCTCTCCAT GGCAGTCATG CTGATCCTGA AAGCCATTCG AGGAGTTGTC TTTGTCAAGG 2940 

GCACGCTGCG AGCTTCCTCC CGGCTGCATG ACGAGCTTTT CCGAAGGATC CTTCGAAGCC 3000

CTATGAAGTT TTTTGACACG ACCCCCACAG GGAGGATTCT CAACAGGTTT TCCAAAGACA 3060

TGGATGAAGT TGACGTGCGG CTGCCGTTCC AGGCCGAGAT GTTCATCCAG AACGTTATCC 3120

TGGTGTTCTT CTGTGTGGGA ATGATCGCAG GAGTCTTCCC GTGGTTCCTT GTGGCAGTGG 3180 

GGCCCCTTGT CATCCTCTTT TCAGTCCTGC ACATTGTCTC CAGGGTCCTG ATTCGGGAGC 3240 

TGAAGCGTCT GGACAATATC ACGCAGTCAC CTTTCCTCTC CCACATCACG TCCAGCATAC 3300 

AGGGCCTTGC CACCATCCAC GCCTACAATA AAGGGCAGGA GTTTCTGCAC AGATACCAGG 3360 

AGCTGCTGGA TGACAACCAA GCTCCTTTTT TTTTGTTTAC GTGTGCGATG CGGTGGCTGG 3420 

CTGTGCGGCT GGACCTCATC AGCATCGCCC TCATCACCAC CACGGGGCTG ATGATCGTTC 3480 

TTATGCACGG GCAGATTCCC CCAGCCTATG CGGGTCTCGC CATCTCTTAT GCTGTCCAGT 3540 

TAACGGGGCT GTTCCAGTTT ACGGTCAGAC TGGCATCTGA GACAGAAGCT CGATTCACCT 3600 

CGGTGGAGAG GATCAATCAC TACATTAAGA CTCTGTCCTT GGAAGCACCT GCCAGAATTA 3660

AGAACAAGGC TCCCTCCCCT GACTGGCCCC AGGAGGGAGA GGTGACCTTT GAGAACGCAG 3720 

AGATGAGGTA CCGAGAAAAC CTCCCTCTTG TCCTAAAGAA AGTATCCTTC ACGATCAAAC 3780 

CTAAAGAGAA GATTGGCATT GTGGGGCGGA CAGGATCAGG GAAGTCCTCG CTGGGGATGG 3840 

CCCTCTTCCG TCTGGTGGAG TTATCTGGAG GCTGCATCAA GATTGATGGA GTGAGAATCA 3900 

GTGATATTGG CCTTGCCGAC CTCCGAAGCA AACTCTCTAT CATTCCTCAA GAGCCGGTGC 3960 

TGTTCAGTGG CACTGTCAGA TCAAATTTGG ACCCCTTCAA CCAGTACACT GAAGACCAGA 4020 

TTTGGGATGC CCTGGAGAGG ACACACATGA AAGAATGTAT TGCTCAGCTA CCTCTGAAAC 4080 

TTGAATCTGA AGTGATGGAG AATGGGGATA ACTTCTCAGT GGGGGAACGG CAGCTCTTGT 4140 

GCATAGCTAG AGCCCTGCTC CGCCACTGTA AGATTCTGAT TTTAGATGAA GCCACAGCTG 4200 

CCATGGACAC AGAGACAGAC TTATTGATTC AAGAGACCAT CCGAGAAGCA TTTGCAGACT 4260 

GTACCATGCT GACCATTGCC CATCGCCTGC ACACGGTTCT AGGCTCCGAT AGGATTATGG 4320 

TGCTGGCCCA GGGACAGGTG GTGGAGTTTG ACACCCCATC GGTCCTTCTG TCCAACGACA 4380 

GTTCCCGATT CTATGCCATG TTTGCTGCTG CAGAGAACAA GGTCGCTGTC AAGGGCTGAC 4440 

TCCTCCCTGT TGACGAAGTC TCTTTTCTTT AGAGCATTGC CATTCCCTGC CTGGGGCGGG 4500 

CCCCTCATCG CGTCCTCCTA CCGAAACCTT GCCTTTCTCG ATTTTATCTT TCGCACAGCA 4560 

GTTCCGGATT GGCTTGTGTG TTTCACTTTT AGGGAGAGTC ATATTTTGAT TATTGTATTT 4620 

ATTCCATATT CATGTAAACA AAATTTAGTT TTTGTTCTTA ATTGCACTCT AAAAGGTTCA 4680 

GGGAACCGTT ATTATAATTG TATCAGAGGC CTATAATGAA GCTTTATACG TGTAGCTATA 4740 

TCTATATATA ATTCTGTACA TAGCCTATAT TTACAGTGAA AATGTAAGCT GTTTATTTTA 4800 

TATTAAAATA AGCACTGTGC TAATAACAGT GCATATTCCT TTCTATCATT TTTGTACAGT 4860 

TTGCTGTACT AGAGATCTGG TTTTGCTATT AGACTGTAGG AAGAGTAGCA TTTCATTCTT 4920 

CTCTAGCTGG TGGTTTCACG GTGCCAGGTT TTCTGGGTGT CCAAAGGAAG ACGTGTGGCA 4980 

ATAGTGGGCC CTCCGACAGC CCCCTCTGCC GCCTCCCCAC AGCCGCTCCA GGGGTGGCTG 5040 

GAGACGGGTG GGCGGCTGGA GACCATGCAG AGCGCCGTGA GTTCTCAGGG CTCCTGCCTT 5100 

CTGTCCTGGT GTCACTTACT GTTTCTGTCA GGAGAGCAGC GGGGCGAAGC CCAGGCCCCT 5160

TTTCACTCCC TCCATCAAGA ATGGGGATCA CAGAGACATT CCTCCGAGCC GGGGAGTTTC 5220 

TTTCCTGCCT TCTTCTTTTT GCTGTTGTTT CTAAACAAGA ATCAGTCTAT CCACAGAGAG 5280 

TCCCACTGCC TCAGGTTCCT ATGGCTGGCC ACTGCACAGA GCTCTCCAGC TCCAAGACCT 5340 

GTTGGTTCCA AGCCCTGGAG CCAACTGCTG CTTTTTGAGG TGGCACTTTT TCATTTGCCT 5400 

ATTCCCACAC CTCCACAGTT CAGTGGCAGG GCTCAGGATT TCGTGGGTCT GTTTTCCTTT 5460 

CTCACCGCAG TCGTCGCACA GTCTCTCTCT CTCTCTCCCC TCAAAGTCTG CAACTTTAAG 5520 

CAGCTCTTGC TAATCAGTGT CTCACACTGG CGTAGAAGTT TTTGTACTGT AAAGAGACCT 5580 

ACCTCAGGTT GCTGGTTGCT GTGTGGTTTG GTGTGTTCCC GCAAACCCCC TTTGTGCTGT 5640 

GGGGCTGGTA GCTCAGGTGG GCGTGGTCAC TGCTGTCATC AGTTGAATGG TCAGCGTTGC 5700 

ATGTOGTGAC CAACTAQACA TTCTGTCGCC TTAGCATGTT TGCTGAACAC CTTGTGGAAG 5760 

CAAAAATCTG AAAATGTGAA TAAAATTATT TTGGATTTTG TAAAAAAAAA AAAAAAAAAA 5820 
AAAAAAAAAA AAAAAAAA 

Seq ID NO: 585 Protein sequence
Protein Accession #: NP_005679.1

1 11 21 31 41 51

| | | | | |

MKDIDIGKEY IIPSPGYRSV RERTSTSGTH RDREDSKFRR TRPLECQDAL ETAARAEGLS 60

LDASMHSQLR ILDEEHPKGK YHHGLSALKP IRTTSKHQHP VDNAGLFSCM TFSWLSSLAR 120 

VAHKKGELSM EDVWSLSKHE SSDVNCRRLE RLWQEELNEV GPDAASLRRV VWIFCRTRLI 180

LSIVCLMITQ LAGFSGPAFM VKHLLEYTQA TESNLQYSLL LVLGLLLTEI VRSWSLALTW 240

ALNYRTGVRL RGAILTMAFK KILKLKNIKE KSLGELINIC SNDGQRMFEA AAVGSLLAGG 300

PVVAILGMIY NVIILGPTGF LGSAVFILFY PAMMFASRLT AYFRRKCVAA TDERVQKMNE 360

VLTYIKFIKM YAWVKAFSQS VQKIREEERR ILEKAGYFQG ITVGVAPIVV VIASVVTFSV 420

HMTLGFDLTA AQAFTVVTVF NSMTFALKVT PFSVKSLSEA SVAVDRFKSL FLMEEVHMIK 480

NKPASPHIKI EMKNATLAWD SSHSSIQNSP KLTPKMKKDK RASRGKKEKV RQLQRTEHQA 540 

VLAEQKGHLL LDSDERPSPE EEEGKHIHLG HLRLQRTLHS IDLEIQEGKL VGICGSVGSG 600

KTSLISAILG QMTLLEGSIA ISGTFAYVAQ QAWILNATLR DNILFGKEYD EERYNSVLNS 660

CCLRPDLAIL PSSDLTEIGE RGANLSGGQR QRISLARALY SDRSIYILDD PLSALDAHVG 720

NHIFNSAIRK HLKSKTVLFV THQLQYLVDC DEVIFMKEGC ITERGTHEEL MNLNGDYATI 780 

FNNLLLGETP PVEINSKKET SGSQKKSQDK GPKTGSVKKE KAVKPEEGQL VQLEEKGQGS 840 
VPWSVYGVYI QAAGGPLAFL VIMALFMLNV GSTAFSTWWL SYWIKQGSGN TTVTRGNETS 900

VSDSMKDNPH MQYYASIYAL SMAVMLILKA IRGVVFVKGT LRASSRLHDE LFRRILRSPM 960

KFFDTTPTGR ILNRFSKDMD EVDVRLPFQA EMFIQNVILV FFCVGMIAGV FPWFLVAVGP 1020

LVILFSVLHI VSRVLIRELK RLDNITQSPF LSHITSSIQG LATIHAYNKG QEFLHRYQEL 1080

LDDNQAPFFL FTCAMRWLAV RLDLISIALI TTTGLMIVLM HGQIPPAYAG LAISYAVQLT 1140

GLFQFTVRLA SETEARFTSV ERINHYIKTL SLEAPARIKN KAPSPDWPQE GEVTFENAEM 1200

RYRENLPLVL KKVSFTIKPK EKIGIVGRTG SGKSSLGMAL FRLVELSGGC IKIDGVRISD 1260

IGLADLRSKL SIIPQEPVLF SGTVRSNLDP FNQYTEDQIW DALERTHMKE CIAQLPLKLE 1320

SEVMENGDNF SVGERQLLCI ARALLRHCKI LILDEATAAM DTETDLLIQE TIREAFADCT 1380

MLTIAHRLHT VLGSDRIMVL AQGQVVEFDT PSVLLSNDSS RFYAMFAAAE NKVAVKG

Seq ID NO: 586 DNA sequence

Nucleic Acid Accession #: NM_001327.1

Coding sequence: 89..631

1 11 21 31 41 51

| | | | | |

AGCAGGGGGC GCTGTGTGTA CCGAGAATAC GAGAATACCT CGTGGGCCCT GACCTTCTCT 60

CTGAGAGCCG GGCAGAGGCT CCGGAGCCAT GCAGGCCGAA GGCCGGGGCA CAGGGGGTTC 120

GACGGGCGAT GCTGATGGCC CAGGAGGCCC TGGCATTCCT GATGGCCCAG GGGGCAATGC 180

TGGCGGCCCA GGAGAGGCGG GTGCCACGGG CGGCAGAGGT CCCCGGGGCG CAGGGGCAGC 240

AAGGGCCTCG GGGCCGGGAG GAGGCGCCCC GCGGGGTCCG CATGGCGGCG CGGCTTCAGG 300

GCTGAATGGA TGCTGCAGAT GCGGGGCCAG GGGGCCGGAG AGCCGCCTGC TTGAGTTCTA 360

CCTCGCCATG CCTTTCGCGA CACCCATGGA AGCAGAGCTG GCCCGCAGGA GCCTGGCCCA 420

GGATGCCCCA CCGCTTCCCG TGCCAGGGGT GCTTCTGAAG GAGTTCACTG TGTCCGGCAA 480

CATACTGACT ATCCGACTGA CTGCTGCAGA CCACCGCCAA CTGCAGCTCT CCATCAGCTC 540

CTGTCTCCAG CAGCTTTCCC TGTTGATGTG GATCACGCAG TGCTTTCTGC CCGTGTTTTT 600

GGCTCAGCCT CCCTCAGGGC AGAGGCGCTA AGCCCAGCCT GGCGCCCCTT CCTAGGTCAT 660

GCCTCCTCCC CTAGGGAATG GTCCCAGCAC GAGTGGCCAG TTCATTGTGG GGGCCTGATT 720

GTTTGTCGCT GGAGGAGGAC GGCTTACATG TTTGTTTCTG TAGAAAATAA AACTGAGCTA 780



Seq ID NO: 587 Protein sequence 
Protein Accession #: NP_001318.1

1 11 21 31 41 51

| | | | | |

MQAEGRGTGG STGDADGPGG PGIPDGPGGN AGGPGEAGAT GGRGPRGAGA ARASGPGGGA 60

PRGPHGGAAS GLNGCCRCGA RGPESRLLEF YLAMPFATPM EAELARRSLA QDAPPLPVPG 120

VLLKEFTVSG NILTIRLTAA DHRQLQLSIS SCLQQLSLLM WITQCFLPVF LAQPPSGQRR 180

Seq ID NO: 588 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 52..459

1 11 21 31 41 51

| | | | | |

CCTCGTGGGC CCTGACCTTC TCTCTGAGAG CCGGGCAGAG GCTCCGGAGC CATGCAGGCC 60

GAAGGCCAGG GCACAGGGGG TTCGACGGGC GATGCTGATG GCCCAGGAGG CCCTGGCATT 120

CCTGATGGCC CAGGGGGCAA TGCTGGCGGC CCAGGAGAGG CGGGTGCCAC GGGCGGCAGA 180

GGTCCCCGGG GCGCAGGGGC AGCAAGGGCC TCGGGGCCGA GAGGAGGCGC CCCGCGGGGT 240

CCGCATGGCG GTGCCGCTTC TGCGCAGGAT GGAAGGTGCC CCTGCGGGGC CAGGAGGCCG 300

GACAGCCGCC TGCTTCAGTT CCGACTGACT GCTGCAGACC ACCGCCAACT GCAGCTCTCC 360

ATCAGCTCCT GTCTCCAGCA GCTTTCCCTG TTGATGTGGA TCACGCAGTG CTTTCTGCCC 420

GTGTTTTTGG CTCAGGCTCC CTCAGGGCAG AGGCGCTAAG CCCAGCCTGG CGCCCCTTCC 480

TAGGTCATGC CTCCTCCCCT AGGGAATGGT CCCAGCACGA GTGGCCAGTT CATTGTGGGG 540

GCCTGATTGT TTGTCGCTGG AGGAGGACGG CTTACATGTT TGTTTCTGTA GAAAATAAAG 600

CTGAGCTA



Seq ID NO: 589 Protein sequence 
Protein Accession #: Eos sequence 



1 11 21 31 41 51 

| | | | | |

MQAEGQGTGG STGDADGPGG PGIPDGPGGN AGGPGEAGAT GGRGPRGAGA ARASGPRGGA 60

PRGPHGGAAS AQDGRCPCGA RRPDSRLLQF RLTAADHRQL QLSISSCLQQ LSLLMWITQC 120

FLPVFLAQAP SGQRR

Seq ID NO: 590 DNA sequence 
Nucleic Acid Accession #: NM_005562.1
Coding sequence: 90..3671

1 11 21 31 41 51

| | | | | |

ACAGCGGAGC GCAGAGTGAG AACCACCAAC CGAGGCGCCG GGCAGCGACC CCTGCAGCGG 60

AGACAGAGAC TGAGCGGCCC GGCACCGCCA TGCCTGCGCT CTGGCTGGGC TGCTGCCTCT 120

GCTTCTCGCT CCTCCTGCCC GCAGCCCGGG CCACCTCCAG GAGGGAAGTC TGTGATTGCA 180

ATGGGAAGTC CAGGCAGTGT ATCTTTGATC GGGAACTTCA CAGACAAACT GGTAATGGAT 240

TCCGCTGCCT CAACTGCAAT GACAACACTG ATGGCATTCA CTGCGAGAAG TGCAAGAATG 300

GCTTTTACCG GCACAGAGAA AGGGACCGCT GTTTGCCCTG CAATTGTAAC TCCAAAGGTT 360

CTCTTAGTGC TCGATGTGAC AACTCTGGAC GGTGCAGCTG TAAACCAGGT GTGACAGGAG 420

CCAGATGCGA CCGATGTCTG CCAGGCTTCC ACATGCTCAC GGATGCGGGG TGCACCCAAG 480

ACCAGAGACT GCTAGACTCC AAGTGTGACT GTGACCCAGC TGGCATCGCA GGGCCCTGTG 540

ACGCGGGCCG CTGTGTCTGC AAGCCAGCTG TTACTGGAGA ACGCTGTGAT AGGTGTCGAT 600

CAGGTTACTA TAATCTGGAT GGGGGGAACC CTGAGGGCTG TACCCAGTGT TTCTGCTATG 660

GGCATTCAGC CAGCTGCCGC AGCTCTGCAG AATACAGTGT CCATAAGATC ACCTCTACCT 720

TTCATCAAGA TGTTGATGGC TGGAAGGCTG TCCAACGAAA TGGGTCTCCT GCAAAGCTCC 780

AATGGTCACA GCGCCATCAA GATGTGTTTA GCTCAGCCCA ACGACTAGAC CCTGTCTATT 840

TTGTGGCTCC TGCCAAATTT CTTGGGAATC AACAGGTGAG CTATGGGCAA AGCCTGTCCT 900

TTGACTACCG TGTGGACAGA GGAGGCAGAC ACCCATCTGC CCATGATGTG ATTCTGGAAG 960

GTGCTGGTCT ACGGATCACA GCTCCCTTGA TGCCACTTGG CAAGACACTG CCTTGTGGGC 1020 

TCACCAAGAC TTACACATTC AGGTTAAATG AGCATCCAAG CAATAATTGG AGCCCCCAGC 1080 

TGAGTTACTT TGAGTATCGA AGGTTACTGC GGAATCTCAC AGCCCTCCGC ATCCGAGCTA 1140 

CATATGGAGA ATACAGTACT GGGTACATTG ACAATGTGAC CCTGATTTCA GCCCGCCCTG 1200 

TCTCTGGAGC CCCAGCACCC TGGGTTGAAC AGTGTATATG TCCTGTTGGG TACAAGGGGC 1260 

AATTCTGCCA GGATTGTGCT TCTGGCTACA AGAGAGATTC AGCGAGACTG GGGCCTTTTG 1320 

GCACCTGTAT TCCTTGTAAC TGTCAAGGGG GAGGGGCCTG TGATCCAGAC ACAGGAGATT 1380 

GTTATTCAGG GGATGAGAAT CCTGACATTG AGTGTGCTGA CTGCCCAATT GGTTTCTACA 1440 

ACGATCCGCA CGACCCCCGC AGCTGCAAGC CATGTCCCTG TCATAACGGG TTCAGCTGCT 1500 

CAGTGATGCC GGAGACGGAG GAGGTGGTGT GCAATAACTG CCCTCCCGGG GTCACCGGTG 1560 

CCCGCTGTGA GCTCTGTGCT GATGGCTACT TTGGGGACCC CTTTGGTGAA CATGGCCCAG 1620 

TGAGGCCTTG TCAGCCCTGT CAATGCAACA ACAATGTGGA CCCCAGTGCC TCTGGGAATT 1680 

GTGACCGGCT GACAGGCAGG TGTTTGAAGT GTATCCACAA CACAGCCGGC ATCTACTGCG 1740 

ACCAGTGCAA AGCAGGCTAC TTCGGGGACC CATTGGCTCC CAACCCAGCA GACAAGTGTC 1800 

GAGCTTGCAA CTGTAACCCC ATGGGCTCAG AGCCTGTAGG ATGTCGAAGT GATGGCACCT 1860 

GTGTTTGCAA GCCAGGATTT GGTGGCCCCA ACTGTGAGCA TGGAGCATTC AGCTGTCCAG 1920 

CTTGCTATAA TCAAGTGAAG ATTCAGATGG ATCAGTTTAT GCAGCAGCTT CAGAGAATGG 1980 

AGGCCCTGAT TTCAAAGGCT CAGGGTGGTG ATGGAGTAGT ACCTGATACA GAGCTGGAAG 2040 

GCAGGATGCA GCAGGCTGAG CAGGCCCTTC AGGACATTCT GAGAGATGCC CAGATTTCAG 2100 

AAGGTGCTAG CAGATCCCTT GGTCTCCAGT TGGCCAAGGT GAGGAGCCAA GAGAACAGCT 2160 

ACCAGAGCCG CCTGGATGAC CTCAAGATGA CTGTGGAAAG AGTTCGGGCT CTGGGAAGTC 2220 

AGTACCAGAA CCGAGTTCGG GATACTCACA GGCTCATCAC TCAGATGCAG CTGAGCCTGG 2280 

CAGAAAGTGA AGCTTCCTTG GGAAACACTA ACATTCCTGC CTCAGACCAC TACGTGGGGC 2340 

CAAATGGCTT TAAAAGTCTG GCTCAGGAGG CCACAAGATT AGCAGAAAGC CACGTTGAGT 2400 

CAGCCAGTAA CATGGAGCAA CTGACAAGGG AAACTGAGGA CTATTCCAAA CAAGCCCTCT 2460 

CACTGGTGCG CAAGGCCCTG CATGAAGGAG TCGGAAGCGG AAGCGGTAGC CCGGACGGTG 2520

CTGTGGTGCA AGGGCTTGTG GAAAAATTGG AGAAAACCAA GTCCCTGGCC CAGCAGTTGA 2580 

CAAGGGAGGC CACTCAAGCG GAAATTGAAG CAGATAGGTC TTATCAGCAC AGTCTCCGCC 2640

TCCTGGATTC AGTGTCTCGG CTTCAGGGAG TCAGTGATCA GTCCTTTCAG GTGGAAGAAG 2700 

CAAAGAGGAT CAAACAAAAA GCGGATTCAC TCTCAACGCT GGTAACCAGG CATATGGATG 2760 

AGTTCAAGCG TACACAAAAG AATCTGGGAA ACTGGAAAGA AGAAGCACAG CAGCTCTTAC 2820 

AGAATGGAAA AAGTGGGAGA GAGAAATCAG ATCAGCTGCT TTCCCGTGCC AATCTTGCTA 2880 

AAAGCAGAGC ACAAGAAGCA CTGAGTATGG GCAATGCCAC TTTTTATGAA GTTGAGAGCA 2940 

TCCTTAAAAA CCTCAGAGAG TTTGACCTGC AGGTGGACAA CAGAAAAGCA GAAGCTGAAG 3000 

AAGCCATGAA GAGACTCTCC TACATCAGCC AGAAGGTTTC AGATGCCAGT GACAAGACCC 3060 

AGCAAGCAGA AAGAGCCCTG GGGAGCGCTG CTGCTGATGC ACAGAGGGCA AAGAATGGGG 3120 

CCGGGGAGGC CCTGGAAATC TCCAGTGAGA TTGAACAGGA GATTGGGAGT CTGAACTTGG 3180 

AAGCCAATGT GACAGCAGAT GGAGCCTTGG CCATGGAAAA GGGACTGGCC TCTCTGAAGA 3240 

GTGAGATGAG GGAAGTGGAA GGAGAGCTGG AAAGGAAGGA GCTGGAGTTT GACACGAATA 3300 

TGGATGCAGT ACAGATGGTG ATTACAGAAG CCCAGAAGGT TGATACCAGA GCCAAGAACG 3360 

CTGGGGTTAC AATCCAAGAC ACACTCAACA CATTAGACGG CCTCCTGCAT CTGATGGACC 3420 

AGCCTCTCAG TGTAGATGAA GAGGGGCTGG TCTTACTGGA GCAGAAGCTT TCCCGAGCCA 3480 

AGACCCAGAT CAACAGCCAA CTGCGGCCCA TGATGTCAGA GCTGGAAGAG AGGGCACGTC 3540 

AGCAGAGGGG CCACCTCCAT TTGCTGGAGA CAAGCATAGA TGGGATTCTG GCTGATGTGA 3600 

AGAACTTGGA GAACATTAGG GACAACCTGC CCCCAGGCTG CTACAATACC CAGGCTCTTG 3660 

AGCAACAGTG AAGCTGCCAT AAATATTTCT CAACTGAGGT TCTTGGGATA CAGATCTCAG 3720 

GGCTCGGGAG CCATGTCATG TGAGTGGGTG GGATGGGGAC ATTTGAACAT GTTTAATGGG 3780 

TATGCTCAGG TCAACTGACC TGACCCCATT CCTGATCCCA TGGCCAGGTG GTTGTCTTAT 3840 

TGCACCATAC TCCTTGCTTC CTGATGCTGG GCAATGAGGC AGATAGCACT GGGTGTGAGA 3900 

ATGATCAAGG ATCTGGACCC CAAAGAATAG ACTGGATGGA AAGACAAACT GCACAGGCAG 3960 

ATGTTTGCCT CATAATAGTC GTAAGTGGAG TCCTGGAATT TGGACAAGTG CTGTTGGGAT 4020 

ATAGTCAACT TATTCTTTGA GTAATGTGAC TAAAGGAAAA AACTTTGACT TTGCCCAGGC 4080 

ATGAAATTCT TCCTAATGTC AGAACAGAGT GCAACCCAGT CACACTGTGG CCAGTAAAAT 4140 

ACTATTGCCT CATATTGTCC TCTGCAAGCT TCTTGCTGAT CAGAGTTCCT CCTACTTACA 4200 

ACCCAGGGTG TGAACATGTT CTCCATTTTC AAGCTGGAAG AAGTGAGCAG TGTTGGAGTG 4260 

AGGACCTGTA AGGCAGGCCC ATTCAGAGCT ATGGTGCTTG CTGGTGCCTG CCACCTTCAA 4320 

GTTCTGGACC TGGGCATGAC ATCCTTTCTT TTAATGATGC CATGGCAACT TAGAGATTGC 4380 

ATTTTTATTA AAGCATTTCC TACCAGCAAA GCAAATGTTG GGAAAGTATT TACTTTTTCG 4440 

GTTTCAAAGT GATAGAAAAG TGTGGCTTGG GCATTGAAAG AGGTAAAATT CTCTAGATTT 4500 

ATTAGTCCTA ATTCAATCCT ACTTTTCGAA CACCAAAAAT GATGCGCATC AATGTATTTT 4560 

ATCTTATTTT CTCAATCTCC TCTCTCTTTC CTCCACCCAT AATAAGAGAA TGTTCCTACT 4620 

CACACTTCAG CTGGGTCACA TCCATCCCTC CATTCATCCT TCCATCCATC TTTCCATCCA 4680 

TTACCTCCAT CCATCCTTCC AACATATATT TATTGAGTAC CTACTGTGTG CCAGGGGCTG 4740 

GTGGGACAGT GGTGACATAG TCTCTGCCCT CATAGAGTTG ATTGTCTAGT GAGGAAGACA 4800 

AGCATTTTTA AAAAATAAAT TTAAACTTAC AAACTTTGTT TGTCACAAGT GGTGTTTATT 4860 

GCAATAACCG CTTGGTTTGC AACCTCTTTG CTCAACAGAA CATATGTTGC AAGACCCTCC 4920 

CATGGGGGCA CTTGAGTTTT GGCAAGGCTG ACAGAGCTCT GGGTTGTGCA CATTTCTTTG 4980 

CATTCCAGCT GTCACTCTGT GCCTTTCTAC AACTGATTGC AACAGACTGT TGAGTTATGA 5040 

TAACACCAGT GGGAATTGCT GGAGGAACCA GAGGCACTTC CACCTTGGCT GGGAAGACTA 5100 

TGGTGCTGCC TTGCTTCTGT ATTTCCTTGG ATTTTCCTGA AAGTGTTTTT AAATAAAGAA 5160 
CAATTGTTAG ATGCC 

Seq ID NO: 591 Protein sequence
Protein Accession #: NP_005553.1

1 11 21 31 41 51

| | | | | |

MPALWLGCCL CFSLLLPAAR ATSRREVCDC NGKSRQCIFD RELHRQTGNG FRCLNCNDNT 60 

DGIHCEKCKN GFYRHRERDR CLPCNCNSKG SLSARCDNSG RCSCKPGVTG ARCDRCLPGF 120 

HMLTDAGCTQ DQRLLDSKCD CDPAGIAGPC DAGRCVCKPA VTGERCDRCR SGYYNLDGGN 180 

PEGCTQCFCY GHSASCRSSA EYSVHKITST FHQDVDGWKA VQRNGSPAKL QWSQRHQDVF 240

SSAQRLDPVY FVAPAKFLGN QQVSYGQSLS FDYRVDRGGR HPSAHDVILE GAGLRITAPL 300

MPLGKTLPCG LTKTYTFRLN EHPSNNWSPQ LSYFEYRRLL RNLTALRIRA TYGEYSTGYI 360

DNVTLISARP VSGAPAPWVE QCICPVGYKG QFCQDCASGY KRDSARLGPF GTCIPCNCQG 420

GGACDPDTGD CYSGDENPDI ECADCPIGFY NDPHDPRSCK PCPCHNGFSC SVMPETEEVV 480



CNNCPPGVTG ARCELCADGY FGDPFGEHGP VRPCQPCQCN NNVDPSASGN CDRLTGRCLK 540

CIHNTAGIYC DQCKAGYFGD PLAPNPADKC RACNCNPMGS EPVGCRSDGT CVCKPGFGGP 600

NCEHGAFSCP ACYNQVKIQM DQFMQQLQRM EALISKAQGG DGVVPDTELE GRMQQAEQAL 660

QDILRDAQIS EGASRSLGLQ LAKVRSQENS YQSRLDDLKM TVERVRALGS QYQNRVRDTH 720 

RLITQMQLSL AESEASLGNT NIPASDHYVG PNGFKSLAQE ATRLAESHVE SASNMEQLTR 780 

ETEDYSKQAL SLVRKALHEG VGSGSGSPDG AVVQGLVEKL EKTKSLAQQL TREATQAEIE 840

ADRSYQHSLR LLDSVSRLQG VSDQSFQVEE AKRIKQKADS LSTLVTRHMD EFKRTQKNLG 900

NWKEEAQQLL QNGKSGREKS DQLLSRANLA KSRAQEALSM GNATFYEVES ILKNLREFDL 960

QVDNRKAEAE EAMKRLSYIS QKVSDASDKT QQAERALGSA AADAQRAKNG AGEALEISSE 1020 

IEQEIGSLNL EANVTADGAL AMEKGLASLK SEMREVEGEL ERKELEFDTN MDAVQMVITE 1080 

AQKVDTRAKN AGVTIQDTLN TLDGLLHLMD QPLSVDEEGL VLLEQKLSRA KTQINSQLRP 1140 
MMSELEERAR QQRGHLHLLE TSIDGILADV KNLENIRDNL PPGCYNTQAL EQQ

Seq ID NO: 592 DNA sequence
Nucleic Acid Accession #: AF101051.1
Coding sequence: 221..856

1 11 21 31 41 51

| | | | | |

GAGCAACCTC AGCTTCTAGT ATCCAGACTC CAGCGCCGCC CCGGGCGCGG ACCCCAACCC 60 

CGACCCAGAG CTTCTCCAGC GGCGGCGCAG CGAGCAGGGC TCCCCGCCTT AACTTCCTCC 120 

GCGGGGCCCA GCCACCTTCG GGAGTCCGGG TTGCCCACCT GCAAACTCTC CGCCTTCTGC 180 

ACCTGCCACC CCTGAGCCAG CGCGGGCGCC CGAGCGAGTC ATGGCCAACG CGGGGCTGCA 240 

GCTGTTGGGC TTCATTCTCG CCTTCCTGGG ATGGATCGGC GCCATCGTCA GCACTGCCCT 300 

GCCCCAGTGG AGGATTTACT CCTATGCCGG CGACAACATC GTGACCGCCC AGGCCATGTA 360 

CGAGGGGCTG TGGATGTCCT GCGTGTCGCA GAGCACCGGG CAGATCCAGT GCAAAGTCTT 420 

TGACTCCTTG CTGAATCTGA GCAGCACATT GCAAGCAACC CGTGCCTTGA TGGTGGTTGG 480 

CATCCTCCTG GGAGTGATAG CAATCTTTGT GGCCACCGTT GGCATGAAGT GTATGAAGTG 540 

CTTGGAAGAC GATGAGGTGC AGAAGATGAG GATGGCTGTC ATTGGGGGTG CGATATTTCT 600 

TCTTGCAGGT CTGGCTATTT TAGTTGCCAC AGCATGGTAT GGCAATAGAA TCGTTCAAGA 660 

ATTCTATGAC CCTATGACCC CAGTCAATGC CAGGTACGAA TTTGGTCAGG CTCTCTTCAC 720 

TGGCTGGGCT GCTGCTTCTC TCTGCCTTCT GGGAGGTGCC CTACTTTGCT GTTCCTGTCC 780 

CCGAAAAACA ACCTCTTACC CAACACCAAG GCCCTATCCA AAACCTGCAC CTTCCAGCGG 840 

GAAAGACTAC GTGTGACACA GAGGCAAAAG GAGAAAATCA TGTTGAAACA AACCGAAAAT 900 

GGACATTGAG ATACTATCAT TAACATTAGG ACCTTAGAAT TTTGGGTATT GTAATCTGAA 960 

GTATGGTATT ACAAAACAAA CAAACAAACA AAAAACCCAT GTGTTAAAAT ACTCAGTGCT 1020 

AAACATGGCT TAATCTTATT TTATCTTCTT TCCTCAATAT AGGAGGGAAG ATTTTACCAT 1080 

TTGTATTACT GCTTCCCATT GAGTAATCAT ACTCAAATGG GGGAAGGGGT GCTCCTTAAA 1140 

TATATATAGA TATGTATATA TACATGTTTT TCTATTAAAA ATAGACAGTA AAATACTATT 1200 

CTCATTATGT TGATACTAGC ATACTTAAAA TATCTCTAAA ATAGGTAAAT GTATTTAATT 1260 

CCATATTGAT GAAGATGTTT ATTGGTATAT TTTCTTTTTC GTCCTTATAT ACATATGTAA 1320 

CAGTCAAATA TCATTTACTC TTCTTCATTA GCTTTGGGTG CCTTTGCCAC AAGACCTAGC 1380 

CTAATTTACC AAGGATGAAT TCTTTCAATT CTTCATGCGT GCCCTTTTCA TATACTTATT 1440 

TTATTTTTTA CCATAATCTT ATAGCACTTG CATCGTTATT AAGCCCTTAT TTGTTTTGTG 1500 

TTTCATTGGT CTCTATCTCC TGAATCTAAC ACATTTCATA GCCTACATTT TAGTTTCTAA 1560 

AGCCAAGAAG AATTTATTAC AAATCAGAAC TTTGGAGGCA AATCTTTCTG CATGACCAAA 1620 

GTGATAAATT CCTGTTGACC TTCCCACACA ATCCCTGTAC TCTGACCCAT AGCACTCTTG 1680 

TTTGCTTTGA AAATATTTGT CCAATTGAGT AGCTGCATGC TGTTCCCCCA GGTGTTGTAA 1740 

CACAACTTTA TTGATTGAAT TTTTAAGCTA CTTATTCATA GTTTTATATC CCCCTAAACT 1800 

ACCTTTTTGT TCCCCATTCC TTAATTGTAT TGTTTTCCCA AGTGTAATTA TCATGCGTTT 1860 

TATATCTTCC TAATAAGGTG TGGTCTGTTT GTCTGAACAA AGTGCTAGAC TTTCTGGAGT 1920 

GATAATCTGG TGACAAATAT TCTCTCTGTA GCTGTAAGCA AGTCACTTAA TCTTTCTACC 1980 

TCTTTTTTCT ATCTGCCAAA TTGAGATAAT GATACTTAAC CAGTTAGAAG AGGTAGTGTG 2040 

AATATTAATT AGTTTATATT ACTCTCATTC TTTGAACATG AACTATGCCT ATGTAGTGTC 2100 

TTTATTTGCT CAGCTGGCTG AGACACTGAA GAAGTCACTG AACAAAACCT ACACACGTAC 2160 

CTTCATGTGA TTCACTGCCT TCCTCTCTCT ACCAGTCTAT TTCCACTGAA CAAAACCTAC 2220 

ACACATACCT TCATGTGGTT CAGTGCCTTC CTCTCTCTAC CAGTCTATTT CCACTGAACA 2280 

AAACCTACGC ACATACCTTC ATGTGGCTCA GTGCCTTCCT CTCTCTACCA GTCTATTTCC 2340 

ATTCTTTCAG CTGTGTCTGA CATGTTTGTG CTCTGTTCCA TTTTAACAAC TGCTCTTACT 2400 

TTTCCAGTCT GTACAGAATG CTATTTCACT TGAGCAAGAT GATGTATGGA AAGGGTGTTG 2460 

GCACTGGTGT CTGGAGACCT GGATTTGAGT CTTGGTGCTA TCAATCACCG TCTGTGTTTG 2520 

AGCAAGGCAT TTGGCTGCTG TAAGCTTATT GCTTCATCTG TAAGCGGTGG TTTGTAATTC 2580 

CTGATCTTCC CACCTCACAG TGATGTTGTG GGGATCCAGT GAGATAGAAT ACATGTAAGT 2640 

GTGGTTTTGT AATTTGAAAA GTGCTATACT AAGGGAAAGA ATTGAGGAAT TAACTGCATA 2700 

CGTTTTGGTG TTGCTTTTCA AATGTTTGAA AATAAAAAAA TGTTAAGAAA TGGGTTTCTT 2760 

GCCTTAACCA GTCTCTCAAG TGATGAGACA GTGAAGTAAA ATTGAGTGCA CTAAACGAAT 2820 

AAGATTCTGA GGAAGTCTTA TCTTCTGCAG TGAGTATGGC CCAATGCTTT CTGTGGCTAA 2880 

ACAGATGTAA TGGGAAGAAA TAAAAGCCTA CGTGTTGGTA AATCCAACAG CAAGGGAGAT 2940 

TTTTGAATCA TAATAACTCA TAAGGTGCTA TCTGTTCAGT GATGCCCTCA GAGCTCTTGC 3000 

TGTTAGCTGG CAGCTGACGC TGCTAGGATA GTTAGTTTGG AAATGGTACT TCATAATAAA 3060 

CTACACAAGG AAAGTCAGCC ACCGTGTCTT ATGAGGAATT GGACCTAATA AATTTTAGTG 3120 

TGCCTTCCAA ACCTGAGAAT ATATGCTTTT GGAAGTTAAA ATTTAAATGG CTTTTGCCAC 3180 

ATACATAGAT CTTCATGATG TGTGAGTGTA ATTCCATGTG GATATCAGTT ACCAAACATT 3240 

ACAAAAAAAT TTTATGGCCC AAAATGACCA ACGAAATTGT TACAATAGAA TTTATCCAAT 3300 

TTTGATCTTT TTATATTCTT CTACCACACC TGGAAACAGA CCAATAGACA TTTTGGGGTT 3360 

TTATAATGGG AATTTGTATA AAGCATTACT CTTTTTCAAT AAATTGTTTT TTAATTTAAA 3420 
AAAAGGAAAA AAAAAAAAAA AAA 
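
The coordinates given on each "Coding sequence" line can be checked against the listed residues. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive and that the last token of each row is a running position counter; the function names are illustrative only and are not part of the listing. It uses the first four rows (positions 1 to 240) of Seq ID NO: 592 as printed above.

```python
def join_listing_rows(rows):
    """Concatenate the residue groups of listing rows, skipping numeric position counters."""
    groups = []
    for row in rows:
        for token in row.split():
            if token.isdigit():  # running position counter at the end of a row
                continue
            groups.append(token.upper())
    return "".join(groups)


def coding_region(seq, start, end):
    """Return residues start..end, treating both coordinates as 1-based and inclusive."""
    return seq[start - 1:end]


def codon_count(start, end):
    """Number of codons spanned by start..end (the stop codon is counted as a codon)."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("coding region length is not a multiple of three")
    return length // 3


# Rows 1 to 240 of Seq ID NO: 592, copied from the listing above.
ROWS = [
    "GAGCAACCTC AGCTTCTAGT ATCCAGACTC CAGCGCCGCC CCGGGCGCGG ACCCCAACCC 60",
    "CGACCCAGAG CTTCTCCAGC GGCGGCGCAG CGAGCAGGGC TCCCCGCCTT AACTTCCTCC 120",
    "GCGGGGCCCA GCCACCTTCG GGAGTCCGGG TTGCCCACCT GCAAACTCTC CGCCTTCTGC 180",
    "ACCTGCCACC CCTGAGCCAG CGCGGGCGCC CGAGCGAGTC ATGGCCAACG CGGGGCTGCA 240",
]

seq = join_listing_rows(ROWS)
start_codon = coding_region(seq, 221, 223)  # first codon of the stated coding region
codons = codon_count(221, 856)              # codons in 221..856, stop codon included
```

Under these assumptions the stated start coordinate falls on an ATG, and the codon count minus the stop codon matches the length of the protein listed as Seq ID NO: 593.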

Seq ID NO: 593 Protein sequence 
Protein Accession #: AAD16433.1 

1 11 21 31 41 51 

I I I I I I 

MANAGLQLLG FILAFLGWIG AIVSTALPQW RIYSYAGDNI VTAQAMYEGL WMSCVSQSTG 60 

QIQCKVFDSL LNLSSTLQAT RALMVVGILL GVIAIFVATV GMKCMKCLED DEVQKMRMAV 120 

IGGAIFLLAG LAILVATAWY GNRIVQEFYD PMTPVNARYE FGQALFTGWA AASLCLLGGA 180 
LLCCSCPRKT TSYPTPRPYP KPAPSSGKDY V 

Seq ID NO: 594 DNA sequence 

Nucleic Acid Accession #: NM_006180.1 

Coding sequence: 352..2820 

1 11 21 31 41 51 

I I I I I I 

CCCCCATTCG CATCTAACAA GGAATCTGCG CCCCAGAGAG TCCCGGACGC CGCCGGTCGG 60 

TGCCCGGCGC GCCGGGCCAT GCAGCGACGG CCGCCGCGGA GCTCCGAGCA GCGGTAGCGC 120 

CCCCCTGTAA AGCGGTTCGC TATGCCGGGA CCACTGTGAA CCCTGCCGCC TGCCGGAACA 180 

CTCTTCGCTC CGGACCAGCT CAGCCTCTGA TAAGCTGGAC TCGGCACGCC CGCAACAAGC 240 

ACCGAGGAGT TAAGAGAGCC GCAAGCGCAG GGAAGGCCTC CCCGCACGGG TGGGGGAAAG 300 

CGGCCGGTGC AGCGCGGGGA CAGGCACTCG GGCTGGCACT GGCTGCTAGG GATGTCGTCC 360 

TGGATAAGGT GGCATGGACC CGCCATGGCG CGGCTCTGGG GCTTCTGCTG GCTGGTTGTG 420 

GGCTTCTGGA GGGCCGCTTT CGCCTGTCCC ACGTCCTGCA AATGCAGTGC CTCTCGGATC 480 

TGGTGCAGCG ACCCTTCTCC TGGCATCGTG GCATTTCCGA GATTGGAGCC TAACAGTGTA 540 

GATCCTGAGA ACATCACCGA AATTTTCATC GCAAACCAGA AAAGGTTAGA AATCATCAAC 600 

GAAGATGATG TTGAAGCTTA TGTGGGACTG AGAAATCTGA CAATTGTGGA TTCTGGATTA 660 

AAATTTGTGG CTCATAAAGC ATTTCTGAAA AACAGCAACC TGCAGCACAT CAATTTTACC 720 

CGAAACAAAC TGACGAGTTT GTCTAGGAAA CATTTCCGTC ACCTTGACTT GTCTGAACTG 780 

ATCCTGGTGG GCAATCCATT TACATGCTCC TGTGACATTA TGTGGATCAA GACTCTCCAA 840 

GAGGCTAAAT CCAGTCCAGA CACTCAGGAT TTGTACTGCC TGAATGAAAG CAGCAAGAAT 900 

ATTCCCCTGG CAAACCTGCA GATACCCAAT TGTGGTTTGC CATCTGCAAA TCTGGCCGCA 960 

CCTAACCTCA CTGTGGAGGA AGGAAAGTCT ATCACATTAT CCTGTAGTGT GGCAGGTGAT 1020 

CCGGTTCCTA ATATGTATTG GGATGTTGGT AACCTGGTTT CCAAACATAT GAATGAAACA 1080 

AGCCACACAC AGGGCTCCTT AAGGATAACT AACATTTCAT CCGATGACAG TGGGAAGCAG 1140 

ATCTCTTGTG TGGCGGAAAA TCTTGTAGGA GAAGATCAAG ATTCTGTCAA CCTCACTGTG 1200 

CATTTTGCAC CAACTATCAC ATTTCTCGAA TCTCCAACCT CAGACCACCA CTGGTGCATT 1260 

CCATTCACTG TGAAAGGCAA CCCCAAACCA GCGCTTCAGT GGTTCTATAA CGGGGCAATA 1320 

TTGAATGAGT CCAAATACAT CTGTACTAAA ATACATGTTA CCAATCACAC GGAGTACCAC 1380 

GGCTGCCTCC AGCTGGATAA TCCCACTCAC ATGAACAATG GGGACTACAC TCTAATAGCC 1440 

AAGAATGAGT ATGGGAAGGA TGAGAAACAG ATTTCTGCTC ACTTCATGGG CTGGCCTGGA 1500 

ATTGACGATG GTGCAAACCC AAATTATCCT GATGTAATTT ATGAAGATTA TGGAACTGCA 1560 

GCGAATGACA TCGGGGACAC CACGAACAGA AGTAATGAAA TCCCTTCCAC AGACGTCACT 1620 

GATAAAACCG GTCGGGAACA TCTCTCGGTC TATGCTGTGG TGGTGATTGC GTCTGTGGTG 1680 

GGATTTTGCC TTTTGGTAAT GCTGTTTCTG CTTAAGTTGG CAAGACACTC CAAGTTTGGC 1740 

ATGAAAGGCC CAGCCTCCGT TATCAGCAAT GATGATGACT CTGCCAGCCC ACTCCATCAC 1800 

ATCTCCAATG GGAGTAACAC TCCATCTTCT TCGGAAGGTG GCCCAGATGC TGTCATTATT 1860 

GGAATGACCA AGATCCCTGT CATTGAAAAT CCCCAGTACT TTGGCATCAC CAACAGTCAG 1920 

CTCAAGCCAG ACACATTTGT TCAGCACATC AAGCGACATA ACATTGTTCT GAAAAGGGAG 1980 

CTAGGCGAAG GAGCCTTTGG AAAAGTGTTC CTAGCTGAAT GCTATAACCT CTGTCCTGAG 2040 

CAGGACAAGA TCTTGGTGGC AGTGAAGACC CTGAAGGATG CCAGTGACAA TGCACGCAAG 2100 

GACTTCCACC GTGAGGCCGA GCTCCTGACC AACCTCCAGC ATGAGCACAT CGTCAAGTTC 2160 

TATGGCGTCT GCGTGGAGGG CGACCCCCTC ATCATGGTCT TTGAGTACAT GAAGCATGGG 2220 

GACCTCAACA AGTTCCTCAG GGCACACGGC CCTGATGCCG TGCTGATGGC TGAGGGCAAC 2280 

CCGCCCACGG AACTGACGCA GTCGCAGATG CTGCATATAG CCCAGCAGAT CGCCGCGGGC 2340 

ATGGTCTACC TGGCGTCCCA GCACTTCGTG CACCGCGATT TGGCCACCAG GAACTGCCTG 2400 

GTCGGGGAGA ACTTGCTGGT GAAAATCGGG GACTTTGGGA TGTCCCGGGA CGTGTACAGC 2460 

ACTGACTACT ACAGGGTCGG TGGCCACACA ATGCTGCCCA TTCGCTGGAT GCCTCCAGAG 2520 

AGCATCATGT ACAGGAAATT CACGACGGAA AGCGACGTCT GGAGCCTGGG GGTCGTGTTG 2580 

TGGGAGATTT TCACCTATGG CAAACAGCCC TGGTACCAGC TGTCAAACAA TGAGGTGATA 2640 

GAGTGTATCA CTCAGGGCCG AGTCCTGCAG CGACCCCGCA CGTGCCCCCA GGAGGTGTAT 2700 

GAGCTGATGC TGGGGTGCTG GCAGCGAGAG CCCCACATGA GGAAGAACAT CAAGGGCATC 2760 

CATACCCTCC TTCAGAACTT GGCCAAGGCA TCTCCGGTCT ACCTGGACAT TCTAGGCTAG 2820 

GGCCCTTTTC CCCAGACCGA TCCTTCCCAA CGTACTCCTC AGACGGGCTG AGAGGATGAA 2880 

CATCTTTTAA CTGCCGCTGG AGGCCACCAA GCTGCTCTCC TTCACTCTGA CAGTATTAAC 2940 

ATCAAAGACT CCGAGAAGCT CTCGAGGGAA GCAGTGTGTA CTTCTTCATC CATAGACACA 3000 

GTATTGACTT CTTTTTGGCA TTATCTCTTT CTCTCTTTCC ATCTCCCTTG GTTGTTCCTT 3060 

TTTCTTTTTT TAAATTTTCT TTTTCTTCTT TTTTTTCGTC TTCCCTGCTT CACGATTCTT 3120 

ACCCTTTCTT TTGAATCAAT CTGGCTTCTG CATTACTATT AACTCTGCAT AGACAAAGGC 3180 

CTTAACAAAC GTAATTTGTT ATATCAGCAG ACACTCCAGT TTGCCCACCA CAACTAACAA 3240 

TGCCTTGTTG TATTCCTGCC TTTGATGTGG ATGAAAAAAA GGGAAAACAA ATATTTCACT 3300 

TAAACTTTGT CACTTCTGCT GTACAGATAT CGAGAGTTTC TATGGATTCA CTTCTATTTA 3360 

TTTATTATTA TTACTGTTCT TATTGTTTTT GGATGGCTTA AGCCTGTGTA TAAAAAAGAA 3420 

AACTTGTGTT CAATCTGTGA AGCCTTTATC TATGGGAGAT TAAAACCAGA GAGAAAGAAG 3480 

ATTTATTATG AACCGCAATA TGGGAGGAAC AAAGACAACC ACTGGGATCA GCTGGTGTCA 3540 

GTCCCTACTT AGGAAATACT CAGCAACTGT TAGCTGGGAA GAATGTATTC GGCACCTTCC 3600 

CCTGAGGACC TTTCTGAGGA GTAAAAAGAC TACTGGCCTC TGTGCCATGG ATGATTCTTT 3660 
TCCCATCACC AGAAATGATA GCGTGCAGTA GAGAGCAAAG ATGGCTT 

Seq ID NO: 595 Protein sequence 
Protein Accession #: NP_006171.1 

1 11 21 31 41 51 

I I I I I I 

MSSWIRWHGP AMARLWGFCW LVVGFWRAAF ACPTSCKCSA SRIWCSDPSP GIVAFPRLEP 60 

NSVDPENITE IFIANQKRLE IINEDDVEAY VGLRNLTIVD SGLKFVAHKA FLKNSNLQHI 120 

NFTRNKLTSL SRKHFRHLDL SELILVGNPF TCSCDIMWIK TLQEAKSSPD TQDLYCLNES 180 

SKNIPLANLQ IPNCGLPSAN LAAPNLTVEE GKSITLSCSV AGDPVPNMYW DVGNLVSKHM 240 

NETSHTQGSL RITNISSDDS GKQISCVAEN LVGEDQDSVN LTVHFAPTIT FLESPTSDHH 300 

WCIPFTVKGN PKPALQWFYN GAILNESKYI CTKIHVTNHT EYHGCLQLDN PTHMNNGDYT 360 

LIAKNEYGKD EKQISAHFMG WPGIDDGANP NYPDVIYEDY GTAANDIGDT TNRSNEIPST 420 

DVTDKTGREH LSVYAVVVIA SVVGFCLLVM LFLLKLARHS KFGMKGPASV ISNDDDSASP 480 

LHHISNGSNT PSSSEGGPDA VIIGMTKIPV IENPQYFGIT NSQLKPDTFV QHIKRHNIVL 540 

KRELGEGAFG KVFLAECYNL CPEQDKILVA VKTLKDASDN ARKDFHREAE LLTNLQHEHI 600 

VKFYGVCVEG DPLIMVFEYM KHGDLNKFLR AHGPDAVLMA EGNPPTELTQ SQMLHIAQQI 660 

AAGMVYLASQ HFVHRDLATR NCLVGENLLV KIGDFGMSRD VYSTDYYRVG GHTMLPIRWM 720 

PPESIMYRKF TTESDVWSLG VVLWEIFTYG KQPWYQLSNN EVIECITQGR VLQRPRTCPQ 780 
EVYELMLGCW QREPHMRKNI KGIHTLLQNL AKASPVYLDI LG 

Seq ID NO: 596 DNA sequence 
Nucleic Acid Accession #: AF410899 
Coding sequence: 483..2999 

1 11 21 31 41 51 

I I I I I I 

GGGAGCAGGA GCCTCGCTGG CTGCTTCGCT CGCGCTCTAC GCGCTCAGTC CCCGGCGGTA 60 

GCAGGAGCCT GGACCCAGGC GCCGGCGGCG GGCGTGAGGC GCCGGAGCCC GGCCTCGAGG 120 

TGCATACCGG ACCCCCATTC GCATCTAACA AGGAATCTGC GCCCCAGAGA GTCCCGGACG 180 

CCGCCGGTCG GTGCCCGGCG CGCCGGGCCA TGCAGCGACG GCCGCCGCGG AGCTCCGAGC 240 

AGCGGTAGCG CCCCCCTGTA AAGCGGTTCG CTATGCCGGG ACCACTGTGA ACCCTGCCGC 300 

CTGCCGGAAC ACTCTTCGCT CCGGACCAGC TCAGCCTCTG ATAAGCTGGA CTCGGCACGC 360 

CCGCAACAAG CACCGAGGAG TTAAGAGAGC CGCAAGCGCA GGGAAGGCCT CCCCGCACGG 420 

GTGGGGGAAA GCGGCCGGTG CAGCGCGGGG ACAGGCACTC GGGCTGGCAC TGGCTGCTAG 480 

GGATGTCGTC CTGGATAAGG TGGCATGGAC CCGCCATGGC GCGGCTCTGG GGCTTCTGCT 540 

GGCTGGTTGT GGGCTTCTGG AGGGCCGCTT TCGCCTGTCC CACGTCCTGC AAATGCAGTG 600 

CCTCTCGGAT CTGGTGCAGC GACCCTTCTC CTGGCATCGT GGCATTTCCG AGATTGGAGC 660 

CTAACAGTGT AGATCCTGAG AACATCACCG AAATTTTCAT CGCAAACCAG AAAAGGTTAG 720 

AAATCATCAA CGAAGATGAT GTTGAAGCTT ATGTGGGACT GAGAAATCTG ACAATTGTGG 780 

ATTCTGGATT AAAATTTGTG GCTCATAAAG CATTTCTGAA AAACAGCAAC CTGCAGCACA 840 

TCAATTTTAC CCGAAACAAA CTGACGAGTT TGTCTAGGAA ACATTTCCGT CACCTTGACT 900 

TGTCTGAACT GATCCTGGTG GGCAATCCAT TTACATGCTC CTGTGACATT ATGTGGATCA 960 

AGACTCTCCA AGAGGCTAAA TCCAGTCCAG ACACTCAGGA TTTGTACTGC CTGAATGAAA 1020 

GCAGCAAGAA TATTCCCCTG GCAAACCTGC AGATACCCAA TTGTGGTTTG CCATCTGCAA 1080 

ATCTGGCCGC ACCTAACCTC ACTGTGGAGG AAGGAAAGTC TATCACATTA TCCTGTAGTG 1140 

TGGCAGGTGA TCCGGTTCCT AATATGTATT GGGATGTTGG TAACCTGGTT TCCAAACATA 1200 

TGAATGAAAC AAGCCACACA CAGGGCTCCT TAAGGATAAC TAACATTTCA TCCGATGACA 1260 

GTGGGAAGCA GATCTCTTGT GTGGCGGAAA ATCTTGTAGG AGAAGATCAA GATTCTGTCA 1320 

ACCTCACTGT GCATTTTGCA CCAACTATCA CATTTCTCGA ATCTCCAACC TCAGACCACC 1380 

ACTGGTGCAT TCCATTCACT GTGAAAGGCA ACCCCAAACC AGCGCTTCAG TGGTTCTATA 1440 

ACGGGGCAAT ATTGAATGAG TCCAAATACA TCTGTACTAA AATACATGTT ACCAATCACA 1500 

CGGAGTACCA CGGCTGCCTC CAGCTGGATA ATCCCACTCA CATGAACAAT GGGGACTACA 1560 

CTCTAATAGC CAAGAATGAG TATGGGAAGG ATGAGAAACA GATTTCTGCT CACTTCATGG 1620 

GCTGGCCTGG AATTGACGAT GGTGCAAACC CAAATTATCC TGATGTAATT TATGAAGATT 1680 

ATGGAACTGC AGCGAATGAC ATCGGGGACA CCACGAACAG AAGTAATGAA ATCCCTTCCA 1740 

CAGACGTCAC TGATAAAACC GGTCGGGAAC ATCTCTCGGT CTATGCTGTG GTGGTGATTG 1800 

CGTCTGTGGT GGGATTTTGC CTTTTGGTAA TGCTGTTTCT GCTTAAGTTG GCAAGACACT 1860 

CCAAGTTTGG CATGAAAGAT TTCTCATGGT TTGGATTTGG GAAAGTAAAA TCAAGACAAG 1920 

GTGTTGGCCC AGCCTCCGTT ATCAGCAATG ATGATGACTC TGCCAGCCCA CTCCATCACA 1980 

TCTCCAATGG GAGTAACACT CCATCTTCTT CGGAAGGTGG CCCAGATGCT GTCATTATTG 2040 

GAATGACCAA GATCCCTGTC ATTGAAAATC CCCAGTACTT TGGCATCACC AACAGTCAGC 2100 

TCAAGCCAGA CACATTTGTT CAGCACATCA AGCGACATAA CATTGTTCTG AAAAGGGAGC 2160 

TAGGCGAAGG AGCCTTTGGA AAAGTGTTCC TAGCTGAATG CTATAACCTC TGTCCTGAGC 2220 

AGGACAAGAT CTTGGTGGCA GTGAAGACCC TGAAGGATGC CAGTGACAAT GCACGCAAGG 2280 

ACTTCCACCG TGAGGCCGAG CTCCTGACCA ACCTCCAGCA TGAGCACATC GTCAAGTTCT 2340 

ATGGCGTCTG CGTGGAGGGC GACCCCCTCA TCATGGTCTT TGAGTACATG AAGCATGGGG 2400 

ACCTCAACAA GTTCCTCAGG GCACACGGCC CTGATGCCGT GCTGATGGCT GAGGGCAACC 2460 

CGCCCACGGA ACTGACGCAG TCGCAGATGC TGCATATAGC CCAGCAGATC GCCGCGGGCA 2520 

TGGTCTACCT GGCGTCCCAG CACTTCGTGC ACCGCGATTT GGCCACCAGG AACTGCCTGG 2580 

TCGGGGAGAA CTTGCTGGTG AAAATCGGGG ACTTTGGGAT GTCCCGGGAC GTGTACAGCA 2640 

CTGACTACTA CAGGGTCGGT GGCCACACAA TGCTGCCCAT TCGCTGGATG CCTCCAGAGA 2700 

GCATCATGTA CAGGAAATTC ACGACGGAAA GCGACGTCTG GAGCCTGGGG GTCGTGTTGT 2760 

GGGAGATTTT CACCTATGGC AAACAGCCCT GGTACCAGCT GTCAAACAAT GAGGTGATAG 2820 

AGTGTATCAC TCAGGGCCGA GTCCTGCAGC GACCCCGCAC GTGCCCCCAG GAGGTGTATG 2880 

AGCTGATGCT GGGGTGCTGG CAGCGAGAGC CCCACATGAG GAAGAACATC AAGGGCATCC 2940 

ATACCCTCCT TCAGAACTTG GCCAAGGCAT CTCCGGTCTA CCTGGACATT CTAGGCTAGG 3000 

GCCCTTTTCC CCAGACCGAT CCTTCCCAAC GTACTCCTCA GACGGGCTGA GAGGATGAAC 3060 

ATCTTTTAAC TGCCGCTGGA GGCCACCAAG CTGCTCTCCT TCACTCTGAC AGTATTAACA 3120 

TCAAAGACTC CGAGAAGCTC TCGAGGGAAG CAGTGTGTAC TTCTTCATCC ATAGACACAG 3180 

TATTGACTTC TTTTTGGCAT TATCTCTTTC TCTCTTTCCA TCTCCCTTGG TTGTTCCTTT 3240 

TTCTTTTTTT AAATTTTCTT TTTCTTCTTT TTTTTCGTCT TCCCTGCTTC ACGATTCTTA 3300 

CCCTTTCTTT TGAATCAATC TGGCTTCTGC ATTACTATTA ACTCTGCATA GACAAAGGCC 3360 

TTAACAAACG TAATTTGTTA TATCAGCAGA CACTCCAGTT TGCCCACCAC AACTAACAAT 3420 

GCCTTGTTGT ATTCCTGCCT TTGATGTGGA TGAAAAAAAG GGAAAACAAA TATTTCACTT 3480 

AAACTTTGTC ACTTCTGCTG TACAGATATC GAGAGTTTCT ATGGATTCAC TTCTATTTAT 3540 

TTATTATTAT TACTGTTCTT ATTGTTTTTG GATGGCTTAA GCCTGTGTAT AAAAAAGAAA 3600 

ACTTGTGTTC AATCTGTGAA GCCTTTATCT ATGGGAGATT AAAACCAGAG AGAAAGAAGA 3660 

TTTATTATGA ACCGCAATAT GGGAGGAACA AAGACAACCA CTGGGATCAG CTGGTGTCAG 3720 

TCCCTACTTA GGAAATACTC AGCAACTGTT AGCTGGGAAG AATGTATTCG GCACCTTCCC 3780 

CTGAGGACCT TTCTGAGGAG TAAAAAGACT ACTGGCCTCT GTGCCATGGA TGATTCTTTT 3840 

CCCATCACCA GAAATGATAG CGTGCAGTAG AGAGCAAAGA TGGCTTCCGT GAGACACAAG 3900 

ATGGCGCATA GTGTGCTCGG ACACAGTTTT GTCTTCGTAG GTTGTGATGA TAGCACTGGT 3960 

TTGTTTCTCA AGCGCTATCC ACAGAACCTT TGTCAACTTC AGTTGAAAAG AGGTGGATTC 4020 
ATGTCCAGAG CTCATTTCGG GGTCAGGTGG GAAAGCC 



Seq ID NO: 597 Protein sequence 
Protein Accession #: AAL67965.1 

1 11 21 31 41 51 

I I I I I I 

MSSWIRWHGP AMARLWGFCW LVVGFWRAAF ACPTSCKCSA SRIWCSDPSP GIVAFPRLEP 60 

NSVDPENITE IFIANQKRLE IINEDDVEAY VGLRNLTIVD SGLKFVAHKA FLKNSNLQHI 120 

NFTRNKLTSL SRKHFRHLDL SELILVGNPF TCSCDIMWIK TLQEAKSSPD TQDLYCLNES 180 

SKNIPLANLQ IPNCGLPSAN LAAPNLTVEE GKSITLSCSV AGDPVPNMYW DVGNLVSKHM 240 

NETSHTQGSL RITNISSDDS GKQISCVAEN LVGEDQDSVN LTVHFAPTIT FLESPTSDHH 300 

WCIPFTVKGN PKPALQWFYN GAILNESKYI CTKIHVTNHT EYHGCLQLDN PTHMNNGDYT 360 

LIAKNEYGKD EKQISAHFMG WPGIDDGANP NYPDVIYEDY GTAANDIGDT TNRSNEIPST 420 

DVTDKTGREH LSVYAVVVIA SVVGFCLLVM LFLLKLARHS KFGMKDFSWF GFGKVKSRQG 480 

VGPASVISND DDSASPLHHI SNGSNTPSSS EGGPDAVIIG MTKIPVIENP QYFGITNSQL 540 

KPDTFVQHIK RHNIVLKREL GEGAFGKVFL AECYNLCPEQ DKILVAVKTL KDASDNARKD 600 

FHREAELLTN LQHEHIVKFY GVCVEGDPLI MVFEYMKHGD LNKFLRAHGP DAVLMAEGNP 660 

PTELTQSQML HIAQQIAAGM VYLASQHFVH RDLATRNCLV GENLLVKIGD FGMSRDVYST 720 

DYYRVGGHTM LPIRWMPPES IMYRKFTTES DVWSLGVVLW EIFTYGKQPW YQLSNNEVIE 780 
CITQGRVLQR PRTCPQEVYE LMLGCWQREP HMRKNIKGIH TLLQNLAKAS PVYLDILG 

Seq ID NO: 598 DNA sequence 
Nucleic Acid Accession #: AB052906 
Coding sequence: 74..814 

1 11 21 31 41 51 

I I I I I I 

AAAACCTTGA GGTGATTCAT CTTCCAGGCT CTCCTTCCAT CAAGTCTCTC CTCCCTAGCG 60 

CTCTGGGTCC TTAATGGCAG CAGCCGCCGC TACCAAGATC CTTCTGTGCC TCCGGCTTCT 120 

GCTCCTGCTG TCCGGCTGGT CCCGGGCTGG GCGAGCCGAC CCTCACTCTC TTTGCTATGA 180 

CATCACCGTC ATCCCTAAGT TCAGACCTGG ACCACGGTGG TGTGCGGTTC AAGGCCAGGT 240 

GGATGAAAAG ACTTTTCTTC ACTATGACTG TGGCAACAAG ACAGTCACAC CTGTCAGTCC 300 

CCTGGGGAAG AAACTAAATG TCACAACGGC CTGGAAAGCA CAGAACCCAG TACTGAGAGA 360 

GGTGGTGGAC ATACTTACAG AGCAACTGCG TGACATTCAG CTGGAGAATT ACACACCCAA 420 

GGAACCCCTC ACCCTGCAGG CCAGGATGTC TTGTGAGCAG AAAGCTGAAG GACACAGCAG 480 

TGGATCTTGG CAGTTCAGTT TCGATGGGCA GATCTTCCTC CTCTTTGACT CAGAGAAGAG 540 

AATGTGGACA ACGGTTCATC CTGGAGCCAG AAAGATGAAA GAAAAGTGGG AGAATGACAA 600 

GGTTGTGGCC ATGTCCTTCC ATTACTTCTC AATGGGAGAC TGTATAGGAT GGCTTGAGGA 660 

CTTCTTGATG GGCATGGACA GCACCCTGGA GCCAAGTGCA GGAGCACCAC TCGCCATGTC 720 

CTCAGGCACA ACCCAACTCA GGGCCACAGC CACCACCCTC ATCCTTTGCT GCCTCCTCAT 780 

CATCCTCCCC TGCTTCATCC TCCCTGGCAT CTGAGGAGAG TCCTTTAGAG TGACAGGTTA 840 

AAGCTGATAC CAAAAGGCTC CTGTGAGCAC GGTCTTGATC AAACTCGCCC TTCTGTCTGG 900 

CCAGCTGCCC ACGACCTACG GTGTATGTCC AGTGGCCTCC AGCAGATCAT GATGACATCA 960 

TGGACCCAAT AGCTCATTCA CTGCCTTGAT TCCTTTTGCC AACAATTTTA CCAGCAGTTA 1020 

TACCTAACAT ATTATGCAAT TTTCTCTTGG TGCTACCTGA TGGAATTCCT GCACTTAAAG 1080 

TTCTGGCTGA CTAAACAAGA TATATCATTT TCTTTCTTCT CTTTTTGTTT GGAAAATCAA 1140 

GTACTTCTTT GAATGATGAT CTCTTTCTTG CAAATGATAT TGTCAGTAAA ATAATCACGT 1200 

TAGACTTCAG ACCTCTGGGG ATTCTTTCCG TGTCCTGAAA GAGAATTTTT AAATTATTTA 1260 

ATAAGAAAAA ATTTATATTA ATGATTGTTT CCTTTAGTAA TTTATTGTTC TGTACTGATA 1320 
TTTAAATAAA GAGTTCTATT TCCCAAAAAA AAAAAAAAAA AA 

Seq ID NO: 599 Protein sequence 
Protein Accession #: BAB61048.1 

1 11 21 31 41 51 

I I I I I I 

MAAAAATKIL LCLPLLLLLS GWSRAGRADP HSLCYDITVI PKFRPGPRWC AVQGQVDEKT 60 

FLHYDCGNKT VTPVSPLGKK LNVTTAWKAQ NPVLREVVDI LTEQLRDIQL ENYTPKEPLT 120 

LQARMSCEQK AEGHSSGSWQ FSFDGQIFLL FDSEKRMWTT VHPGARKMKE KWENDKVVAM 180 

SFHYFSMGDC IGWLEDFLMG MDSTLEPSAG APLAMSSGTT QLRATATTLI LCCLLIILPC 240 
FILPGI 

Seq ID NO: 600 DNA sequence 

Nucleic Acid Accession #: NM_001898.1 

Coding sequence: 57..482 

1 11 21 31 41 51 

I I I I I I 

GGCTCTCACC CTCCTCTCCT GCAGCTCCAG CTTTGTGCTC TGCCTCTGAG GAGACCATGG 60 

CCCAGTATCT GAGTACCCTG CTGCTCCTGC TGGCCACCCT AGCTGTCGCC CTGGCCTGGA 120 

GCCCCAAGGA GGAGGATAGG ATAATCCCGG GTGGCATCTA TAACGCAGAC CTCAATGATG 180 

AGTGGGTACA GCGTGCCCTT CACTTCGCCA TCAGCGAGTA TAACAAGGCC ACCAAAGATG 240 

ACTACTACAG ACGTCCGCTG CGGGTACTAA GAGCCAGGCA ACAGACCGTT GGGGGGGTGA 300 

ATTACTTCTT CGACGTAGAG GTGGGCCGCA CCATATGTAC CAAGTCCCAG CCCAACTTGG 360 

ACACCTGTGC CTTCCATGAA CAGCCAGAAC TGCAGAAGAA ACAGTTGTGC TCTTTCGAGA 420 

TCTACGAAGT TCCCTGGGAG AACAGAAGGT CCCTGGTGAA ATCCAGGTGT CAAGAATCCT 480 

AGGGATCTGT GCCAGGCCAT TCGCACCAGC CACCACCCAC TCCCACCCCC TGTAGTGCTC 540 

CCACCCCTGG ACTGGTGGCC CCCACCCTGC GGGAGGCCTC CCCATGTGCC TGCGCCAAGA 600 

GACAGACAGA GAAGGCTGCA GGAGTCCTTT GTTGCTCAGC AGGGCGCTCT GCCCTCCCTC 660 

CTTCCTTCTT GCTTCTAATA GCCCTGGTAC ATGGTACACA CCCCCCCACC TCCTGCAATT 720 
AAACAGTAGC ATCGCC 

Seq ID NO: 601 Protein sequence 
Protein Accession #: NP_001889.1 

1 11 21 31 41 51 

I I I I I I 

MAQYLSTLLL LLATLAVALA WSPKEEDRII PGGIYNADLN DEWVQRALHF AISEYNKATK 60 

DDYYRRPLRV LRARQQTVGG VNYFFDVEVG RTICTKSQPN LDTCAFHEQP ELQKKQLCSF 120 
EIYEVPWENR RSLVKSRCQE S 

Seq ID NO: 602 DNA sequence 

Nucleic Acid Accession #: NM_003976.2 

Coding sequence: 299..961 

1 11 21 31 41 51 

I I I I I I 

CTCTGAGCTT CTCTGAGCCT TGTTTGCTCA TCTGGAAAAA GGGGATTAAA CCATTTACCT 60 

CATGGAGTTG TGAAAQAATA GCTGCAAAGC ACCTAACACA TAGTAAGGTT CCCAGTGCAG 120 

CTACTTCTGC TGGGTTGAGT CTAGCTGTGT AGGCCCCTTG TTCCTCACCT GGAGAAACTG 180 

GGGTGGCAGG CCGGTCCCCC ACAAAAGATA ACTCATCTCT TAATTTGCAA GCTGCCTCAA 240 

CAGGAGGGTG GGGGAACAGC TCAACAATGG CTGATGGGCG CTCCTGGTGT TGATAGAGAT 300 

GGAACTTGGA CTTGGAGGCC TCTCCACGCT GTCCCACTGC CCCTGGCCTA GGCGGCAGCC 360 

TGCCCTGTGG CCCACCCTGG CCGCTCTGGC TCTGCTGAGC AGCGTCGCAG AGGCCTCCCT 420 

GGGCTCCGCG CCCCGCAGCC CTGCCCCCCG CGAAGGCCCC CCGCCTGTCC TGGCGTCCCC 480 

CGCCGGCCAC CTGCCGGGGG GACGCACGGC CCGCTGGTGC AGTGGAAGAG CCCGGCGGCC 540 

GCCGCCGCAG CCTTCTCGGC CCGCGCCCCC GCCGCCTGCA CCCCCATCTG CTCTTCCCCG 600 

CGGGGGCCGC GCGGCGCGGG CTGGGGGCCC GGGCAGCCGC GCTCGGGCAG CGGGGGCGCG 660 

GGGCTGCCGC CTGCGCTCGC AGCTGGTGCC GGTGCGCGCG CTCGGCCTGG GCCACCGCTC 720 

CGACGAGCTG GTGCGTTTCC GCTTCTGCAG CGGCTCCTGC CGCCGCGCGC GCTCTCCACA 780 

CGACCTCAGC CTGGCCAGCC TACTGGGCGC CGGGGCCCTG CGACCGCCCC CGGGCTCCCG 840 

GCCCGTCAGC CAGCCCTGCT GCCGACCCAC GCGCTACGAA GCGGTCTCCT TCATGGACGT 900 

CAACAGCACC TGGAGAACCG TGGACCGCCT CTCCGCCACC GCCTGCGGCT GCCTGGGCTG 960 

AGGGCTCGCT CCAGGGCTTT GCAGACTGGA CCCTTACCGG TGGCTCTTCC TGCCTGGGAC 1020 

CCTCCCGCAG AGTCCCACTA GCCAGCGGCC TCAGCCAGGG ACGAAGGCCT CAAAGCTGAG 1080 

AGGCCCCTAC CGGTGGGTGA TGGATATCAT CCCCGAACAG GTGAAGGGAC AACTGACTAG 1140 

CAGCCCCAGA GCCCTCACCC TGCGGATCCC AGCCTAAAAG ACACCAGAGA CCTCAGCTAT 1200 

GGAGCCCTTC GGACCCACTT CTCACAGACT CTGGCACTGG CCAGGCCTCG AACCTGGGAC 1260 

CCCTCCTCTG ATGAACACTA CAGTGGCTGA GGCATCAGCC CCCGCCCAGG CCCTGTAGGG 1320 

ACAGCATTTG AAGGACACAT ATTGCAGTTG CTTGGTTGAA AGTGCCTGTG CTGGAACTGG 1380 
CCTGTACTCA CTCATGGGAG CTGGCCCC 

Seq ID NO: 603 Protein sequence 

Protein Accession #: NP_003967.1 

1 11 21 31 41 51 

I I I I I I 

MELGLGGLST LSHCPWPRRQ PALWPTLAAL ALLSSVAEAS LGSAPRSPAP REGPPPVLAS 60 

PAGHLPGGRT ARWCSGRARR PPPQPSRPAP PPPAPPSALP RGGRAARAGG PGSRARAAGA 120 

RGCRLRSQLV PVRALGLGHR SDELVRFRFC SGSCRRARSP HDLSLASLLG AGALRPPPGS 180 
RPVSQPCCRP TRYEAVSFMD VNSTWRTVDR LSATACGCLG 

Seq ID NO: 604 DNA sequence 
Nucleic Acid Accession #: NM_057091.1 
Coding sequence: 783..1445 

1 11 21 31 41 51 

I I I I I I 

ACTGGCCGCT GAGAGAAGAA TCGGGTGGAG CAGAGAGCAG CTGCTGCAGG GCAGACAGCC 60 

GGACCCCCAA ATCTGCACGT ACCAGCAGTC AGCCGCCCCA CGCAGGGACC GGCTTACCCC 120 

TCGCTCCCCG CCCTCACTCA CTTTCTCCCG CCCTCGGCCC GGCCTCCCAG CTCTCTACTT 180 

CGCGTGTCTA CAAACTCAAC TCCCGGTTTC CGTGCCTCTC CACCGCTCGA GTTCTCTACT 240 

CTCCATATCC GAGGGGCCCC TCCCAGCATC TACCCCCCTC CCAACCTCGG GGGACCTAGC 300 

CAAGCTAGGG GGGACTGGAT CCGACGGGTG GAGCAGCCAG GTGAGCCCCG AAAGGTGGGG 360 

CGGGGCAGGG GCGCTCCCAG CCCCACCCCG GGATCTGGTG ACGCTGGGGC TGGAATTTGA 420 

CACCGGACGG CTGCGGCGGC GGGCAGGAGG CTGCTGAGGG ATGGAGTTGG GCCCGGCCCC 480 

CAGACAAGGC CCGGGGGCTC CGCCAGCAGC AGGTCCCTCG GGCCCCAGCC CTCGCTGCCA 540 

CCCGGGCCTG GAGCCCCACA CCCGAGGGTG CAGACTGGCT GCCAAGGCCA CACTTTTGGC 600 

TAAAAGAGGC ACTGCCAGGT GTACAGTCCT GGGCATGCGC TGTTTGAGCT TOGGGGGAGA 660 

GCCCAGCACT GGTCCCCGGA AAGGTGCCTA GAAGAACAAG GTGCAGGACC CCGTGCTGCC 720 

TCAACAGGAG GGTGGGGGAA CAGCTCAACA ATGGCTGATG GGCGCTCCTG GTGTTGATAG 780 

AGATGGAACT TGGACTTGGA GGCCTCTCCA CGCTGTCCCA CTGCCCCTGG CCTAGGCGGC 840 

AGCCTGCCCT GTGGCCCACC CTGGCCGCTC TGGCTCTGCT GAGCAGCGTC GCAGAGGCCT 900 

CCCTGGGCTC CGCGCCCCGC AGCCCTGCCC CCCGCGAAGG CCCCCCGCCT GTCCTGGCGT 960 

CCCCCGCCGG CCACCTGCCG GGGGGACGCA CGGCCCGCTG GTGCAGTGGA AGAGCCCGGC 1020 

GGCCGCCGCC GCAGCCTTCT CGGCCCGCGC CCCCGCCGCC TGCACCCCCA TCTGCTCTTC 1080 

CCCGCGGGGG CCGCGCGGCG CGGGCTGGGG GCCCGGGCAG CCGCGCTCGG GCAGCGGGGG 1140 

CGCGGGGCTG CCGCCTGCGC TCGCAGCTGG TGCCGGTGCG CGCGCTCGGC CTGGGCCACC 1200 

GCTCCGACGA GCTGGTGCGT TTCCGCTTCT GCAGCGGCTC CTGCCGCCGC GCGCGCTCTC 1260 

CACACGACCT CAGCCTGGCC AGCCTACTGG GCGCCGGGGC CCTGCGACCG CCCCCGGGCT 1320 

CCCGGCCCGT CAGCCAGCCC TGCTGCCGAC CCACGCGCTA CGAAGCGGTC TCCTTCATGG 1380 

ACGTCAACAG CACCTGGAGA ACCGTGGACC GCCTCTCCGC CACCGCCTGC GGCTGCCTGG 1440 

GCTGAGGGCT CGCTCCAGGG CTTTGCAGAC TGGACCCTTA CCGGTGGCTC TTCCTGCCTG 1500 

GGACCCTCCC GCAGAGTCCC ACTAGCCAGC GGCCTCAGCC AGGGACGAAG GCCTCAAAGC 1560 

TGAGAGGCCC CTACCGGTGG GTGATGGATA TCATCCCCGA ACAGGTGAAG GGACAACTGA 1620 

CTAGCAGCCC CAGAGCCCTC ACCCTGCGGA TCCCAGCCTA AAAGACACCA GAGACCTCAG 1680 

CTATGGAGCC CTTCGGACCC ACTTCTCACA GACTCTGGCA CTGGCCAGGC CTCGAACCTG 1740 

GGACCCCTCC TCTGATGAAC ACTACAGTGG CTGAGGCATC AGCCCCCGCC CAGGCCCTGT 1800 

AGGGACAGCA TTTGAAGGAC ACATATTGCA GTTGCTTGGT TGAAAGTGCC TGTGCTGGAA 1860 
CTGGCCTGTA CTCACTCATG GGAGCTGGCC CC 

Seq ID NO: 605 Protein sequence 
Protein Accession #: NP_003967.1 

1 11 21 31 41 51 

I I I I I I 

MELGLGGLST LSHCPWPRRQ PALWPTLAAL ALLSSVAEAS LGSAPRSPAP REGPPPVLAS 60 

PAGHLPGGRT ARWCSGRARR PPPQPSRPAP PPPAPPSALP RGGRAARAGG PGSRARAAGA 120 

RGCRLRSQLV PVRALGLGHR SDELVRFRFC SGSCRRARSP HDLSLASLLG AGALRPPPGS 180 
RPVSQPCCRP TRYEAVSFMD VNSTWRTVDR LSATACGCLG 



Seq ID NO: 606 DNA sequence 

Nucleic Acid Accession #: NM_057160.1 

Coding sequence: 1..714 

1 11 21 31 41 51 

I I I I I I 

ATGCCCGGCC TGATCTCAGC CCGAGGACAG CCCCTCCTTG AGGTCCTTCC TCCCCAAGCC 60 

CACCTGGGTG CCCTCTTTCT CCCTGAGGCT CCACTTGGTC TCTCCGCGCA GCCTGCCCTG 120 

TGGCCCACCC TGGCCGCTCT GGCTCTGCTG AGCAGCGTCG CAGAGGCCTC CCTGGGCTCC 180 

GCGCCCCGCA GCCCTGCCCC CCGCGAAGGC CCCCCGCCTG TCCTGGCGTC CCCCGCCGGC 240 

CACCTGCCGG GGGGACGCAC GGCCCGCTGG TGCAGTGGAA GAGCCCGGCG GCCGCCGCCG 300 

CAGCCTTCTC GGCCCGCGCC CCCGCCGCCT GCACCCCCAT CTGCTCTTCC CCGCGGGGGC 360 

CGCGCGGCGC GGGCTGGGGG CCCGGGCAGC CGCGCTCGGG CAGCGGGGGC GCGGGGCTGC 420 

CGCCTGCGCT CGCAGCTGGT GCCGGTGCGC GCGCTCGGCC TGGGCCACCG CTCCGACGAG 480 

CTGGTGCGTT TCCGCTTCTG CAGCGGCTCC TGCCGCCGCG CGCGCTCTCC ACACGACCTC 540 

AGCCTGGCCA GCCTACTGGG CGCCGGGGCC CTGCGACCGC CCCCGGGCTC CCGGCCCGTC 600 

AGCCAGCCCT GCTGCCGACC CACGCGCTAC GAAGCGGTCT CCTTCATGGA CGTCAACAGC 660 

ACCTGGAGAA CCGTGGACCG CCTCTCCGCC ACCGCCTGCG GCTGCCTGGG CTGAGGGCTC 720 

GCTCCAGGGC TTTGCAGACT GGACCCTTAC CGGTGGCTCT TCCTGCCTGG GACCCTCCCG 780 

CAGAGTCCCA CTAGCCAGCG GCCTCAGCCA GGGACGAAGG CCTCAAAGCT GAGAGGCCCC 840 

TACCGGTGGG TGATGGATAT CATCCCCGAA CAGGTGAAGG GACAACTGAC TAGCAGCCCC 900 

AGAGCCCTCA CCCTGCGGAT CCCAGCCTAA AAGACACCAG AGACCTCAGC TATGGAGCCC 960 

TTCGGACCCA CTTCTCACAG ACTCTGGCAC TGGCCAGGCC TCGAACCTGG GACCCCTCCT 1020 

CTGATGAACA CTACAGTGGC TGAGGCATCA GCCCCCGCCC AGGCCCTGTA GGGACAGCAT 1080 

TTGAAGGACA CATATTGCAG TTGCTTGGTT GAAAGTGCCT GTGCTGGAAC TGGCCTGTAC 1140 
TCACTCATGG GAGCTGGCCC C 

Seq ID NO: 607 Protein sequence 
Protein Accession #: NP_476501.1 

1 11 21 31 41 51 

I I I I I I 

MPGLISARGQ PLLEVLPPQA HLGALFLPEA PLGLSAQPAL WPTLAALALL SSVAEASLGS 60 

APRSPAPREG PPPVLASPAG HLPGGRTARW CSGRARRPPP QPSRPAPPPP APPSALPRGG 120 

RAARAGGPGS RARAAGARGC RLRSQLVPVR ALGLGHRSDE LVRFRFCSGS CRRARSPHDL 180 
SLASLLGAGA LRPPPGSRPV SQPCCRPTRY EAVSFMDVNS TWRTVDRLSA TACGCLG 

Seq ID NO: 608 DNA sequence 

Nucleic Acid Accession #: NM_057090.1 

Coding sequence: 29..715 

1 11 21 31 41 51 

I I I I I I 

CTGATGGGCG CTCCTGGTGT TGATAGAGAT GGAACTTGGA CTTGGAGGCC TCTCCACGCT 60 

GTCCCACTGC CCCTGGCCTA GGCGGCAGGC TCCACTTGGT CTCTCCGCGC AGCCTGCCCT 120 

GTGGCCCACC CTGGCCGCTC TGGCTCTGCT GAGCAGCGTC GCAGAGGCCT CCCTGGGCTC 180 

CGCGCCCCGC AGCCCTGCCC CCCGCGAAGG CCCCCCGCCT GTCCTGGCGT CCCCCGCCGG 240 

CCACCTGCCG GGGGGACGCA CGGCCCGCTG GTGCAGTGGA AGAGCCCGGC GGCCGCCGCC 300 

GCAGCCTTCT CGGCCCGCGC CCCCGCCGCC TGCACCCCCA TCTGCTCTTC CCCGCGGGGG 360 

CCGCGCGGCG CGGGCTGGGG GCCCGGGCAG CCGCGCTCGG GCAGCGGGGG CGCGGGGCTG 420 

CCGCCTGCGC TCGCAGCTGG TGCCGGTGCG CGCGCTCGGG CTGGGCCACC GCTCCGACGA 480 

GCTGGTGCGT TTCCGCTTCT GCAGCGGCTC CTGCCGCCGC GCGCGCTCTC CACACGACCT 540 

CAGCCTGGCC AGCCTACTGG GCGCCGGGGC CCTGCGACCG CCCCCGGGCT CCCGGCCCGT 600 

CAGCCAGCCC TGCTGCCGAC CCACGCGCTA CGAAGCGGTC TCCTTCATGG ACGTCAACAG 660 

CACCTGGAGA ACCGTGGACC GCCTCTCCGC CACCGCCTGC GGCTGCCTGG GCTGAGGGCT 720 

CGCTCCAGGG CTTTGCAGAC TGGACCCTTA CCGGTGGCTC TTCCTGCCTG GGACCCTCCC 780 

GCAGAGTCCC ACTAGCCAGC GGCCTCAGCC AGGGACGAAG GCCTCAAAGC TGAGAGGCCC 840 

CTACCGGTGG GTGATGGATA TCATCCCCGA ACAGGTGAAG GGACAACTGA CTAGCAGCCC 900 

CAGAGCCCTC ACCCTGCGGA TCCCAGCCTA AAAGACACCA GAGACCTCAG CTATGGAGCC 960 

CTTCGGACCC ACTTCTCACA GACTCTGGCA CTGGCCAGGC CTCGAACCTG GGACCCCTCC 1020 

TCTGATGAAC ACTACAGTGG CTGAGGCATC AGCCCCCGCC CAGGCCCTGT AGGGACAGCA 1080 

TTTGAAGGAC ACATATTGCA GTTGCTTGGT TGAAAGTGCC TGTGCTGGAA CTGGCCTGTA 1140 
CTCACTCATG GGAGCTGGCC CC 

Seq ID NO: 609 Protein sequence 
Protein Accession #: NP_476431.1 

1 11 21 31 41 51 

I I I I I I 

MELGLGGLST LSHCPWPRRQ APLGLSAQPA LWPTLAALAL LSSVAEASLG SAPRSPAPRE 60 

GPPPVLASPA GHLPGGRTAR WCSGRARRPP PQPSRPAPPP PAPPSALPRG GRAARAGGPG 120 

SRARAAGARG CRLRSQLVPV RALGLGHRSD ELVRFRFCSG SCRRARSPHD LSLASLLGAG 180 
ALRPPPGSRP VSQPCCRPTR YEAVSFMDVN STWRTVDRLS ATACGCLG 

Seq ID NO: 610 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 1..1746 

1 11 21 31 41 51 

I I I I I I 

ATGCCACTGA AGCATTATCT CCTTTTGCTG GTGGGCTGCC AAGCCTGGGG TGCAGGGTTG 60 

GCCTACCATG GCTGCCCTAG CGAGTGTACC TGCTCCAGGG CCTCCCAGGT GGAGTGCACC 120 

GGGGCACGCA TTGTGGCGGT GCCCACCCCT CTGCCCTGGA ACGCCATGAG CCTGCAGATC 180 

CTCAACACGC ACATCACTGA ACTCAATGAG TCCCCGTTCC TCAATATCTC AGCCCTCATC 240 

GCCCTGAGGA TTGAGAAGAA TGAGCTGTCG CGCATCACGC CTGGGGCCTT CCGAAACCTG 300 

GGCTCGCTGC GCTATCTCAG CCTCGCCAAC AACAAGCTGC AGGTTCTGCC CATCGGCCTC 360 

TTCCAGGGCC TGGACAGCCT TGAGTCTCTC CTTCTGTCCA GTAACCAGCT GTTGCAGATC 420 

CAGCCGGCCC ACTTCTCCCA GTGCAGCAAC CTCAAGGAGC TGCAGTTGCA CGGCAACCAC 480 

CTGGAATACA TCCCTGACGG AGCCTTCGAC CACCTGGTAG GACTCACGAA GCTCAATCTG 540 

GGCAAGAATA GCCTCACCCA CATCTCACCC AGGGTCTTCC AGCACCTGGG CAATCTCCAO 600 

GTCCTCCGGC TGTATGAGAA CAGGCTCACG GATATCCCCA TGGGCACTTT TGATGGGCTT 660 

GTTAACCTGC AGGAACTGGC TCTACAGCAG AACCAGATTG GACTGCTCTC CCCTGGTCTC 720 

TTCCACAACA ACCACAACCT CCAGAGACTC TACCTGTCCA ACAACCACAT CTCCCAGCTG 780 

CCACCCAGCA TCTTCATGCA GCTGCCCCAG CTCAACCGTC TTACTCTCTT TGGGAATTCC 840 

CTGAAGGAGC TCTCTCTGGG GATCTTCGGG CCCATGCCCA ACCTGCGGGA GCTTTGGCTC 900 

TATGACAACC ACATCTCTTC TCTACCCGAC AATGTCTTCA GCAACCTCCG CCAGTTGCAG 960 

GTCCTGATTC TTAGCCGCAA TCAGATCAGC TTCATCTCCC CGGGTGCCTT CAACGGGCTA 1020 

AOGGAGCTTC GGGAGCTGTC CCTCCACACC AACGCACTGC AGGACCTGGA CGGGAATGTC 1080 

TTCCGCATGT TGGCCAACCT GCAGAACATC TCCCTGCAGA ACAATCGCCT CAGACAGCTC 1140 

CCAGGGAATA TCTTCGCCAA CGTCAATGGC CTCATGGCCA TCCAGCTGCA GAACAACCAG 1200 

CTGGAGAACT TGCCCCTCGG CATCTTCGAT CACCTGGGGA AACTGTGTGA GCTGCGGCTG 1260 

TATGACAATC CCTGGAGGTG TGACTCAGAC ATCCTTCCGC TCCGCAACTG GCTCCTGCTC 1320 

AACCAGCCTA GGTTAGGGAC GGACACTGTA CCTGTGTGTT TCAGCCCAGC CAATGTCCGA 1380 

GGCCAGTCCC TCATTATCAT CAATGTCAAC GTTGCTGTTC CAAGOGTCCA TGTCCCTGAG 1440 

GTGCCTAGTT ACCCAGAAAC ACCATGGTAC CCAGACACAC CCAGTTACCC TGACACCACA 1500 

TCCGTCTCTT CTACCACTGA GCTAACCAGC CCTGTGGAAG ACTACACTGA TCTGACTACC 1560 

ATTCAGGTCA CTGATGACCG CAGCGTTTGG GGCATGACCC AGGCCCAGAG CGGGCTGGCC 1620 

ATTGCCGCCA TTGTAATTGG CATTGTCGCC CTGGCCTGCT CCCTGGCTGC CTGCGTCGGC 1680 

TGTTGCTGCT GCAAGAAGAG GAGCCAAGCT GTCCTGATGC AGATGAAGGC ACCCAATGAG 1740 

TGTTAAAGAG GCAGGCTGGA GCAGGGCTGG GGAATGATGG GACTGGAGGA CCTGGGAATT 1800 

TCATCTTTCT GCCTCCACCC CTGGGTCCAT GGAGCTTTCC CGTGATTGCT CTTTCTGGCC 1860 

CTAGATAAAG GTGTGCCTAC CTCTTCCTGA CTTGCCTGAT TCTCCCGTAG AGAAGCAGGT 1920 

CGTGCCGGAC CTTCCTACAA TCAGGAAGAT AGATCCAACT GGCCATGGCA AAAGCCCTGG 1980 

GGATTTCCGA TTCATACCCC TGGGCTTCCT TCGAGAGGGC TCTTCCTCCA AATCCTCCCC 2040 

ACCTGTCCTC CAAGAACAGC CTTCCCTGCG CCCAGGCCCC CTCCGGGCCT CTGTAGACTC 2100 

AGTTAGTCCA CAGCCTGCTC ACTTCGTGGG AATAGTTCTC CGCTGAGATA GCCCCTCTCG 2160 

CCTAAGTATT ATGTAAGTTG ATTTCCCTTC TTTTGTTTCT CTTGTTTGTG CTATGGCTTG 2220 

ACCCAGCATG TCCCCTCAAA TGAAAGTTCT CCCCTTGATT TTCTGCTCCT GAAGGCAGGG 2280 

TGAGTTCTCT CCTCAAAGAA GACTTCAAAC CATTTAACTG GTTTCTTAAG AGCCGTCAAT 2340 

CAGCCTGGTT TTGGGGATGC TATGAAAGAG AGAAGGAAAA TCATGCCGCT CAGTTCCTGG 2400 

AGACAGAAGA GCCGTCATCA GTGTCTCACT TGTGATTTTT ATCTGGAAAA GGAAGAAACA 2460 

CCCCAGCACA GCAAGCTCAG CCTTTTAGAG AAGGATATTT CCAAACTGCA AACTTTGCTT 2520 

TGAAAAGTTT AGCCCTTTAA GGAATGAAAT CATGTAGAAT TTTGGACTTC TAAAAACATT 2580 

AAAATCAGCT TATTAATACG GGATAGAGAA AGAAATCTGG TGCCTGGGGG TCCCTGTGTT 2640 

CACCCCTAGA GTTTGTTTTA AAATTTTTAA TTGAAGCATG TGAAGTGTAC STG CAGAAAA 2700 

GTGGGAACAT GATAGTGTAT GGCTTGGTGG ATTTTCACAA ACTGAACATA CCTGTGTAAT 2760 

CAGCATCTAG ACCCAGACCC AGAGCATCAC AAATATCCCC CATCCTGGGC TTTTCCCAGA 2820 

GGAGATGGGG GCTTCTGAAG ATGGACTTAC CTGGGACCTG CCCCCCATGA GCCAGGACGG 2880 

TCCCCCCACA GTCAGCCTGT GCAAAGGCCC CGTGGCCAGG GGTGGAGGAG AATATGTGGG 2940 

TGTGGACAGG ATGGGAGACT GTGGCCTGAA CAGGAGATTT TATTATATCT GGAGACCCTG 3000 

AGAGACCCTG AGACCTGGGG CACCATGGCT GGCCAGGTCA GAAGCATCCT GACTGCAGAG 3060 

GTCCGTGCAG CCACACCCTC TTCCCTGCCA GCAAGTTGTC TGCGGCTCAT CGGAGGCCCC 3120 

TCCGCCTGGA GCCTTCTATG GACGTGATAT GCCTGTATCT GTTTTTAATT TTCATTCTTC 3180 

ACTTAGGGGA AGTGAAATCG CTCAGAGATG AGATCCTTTA ATTGAAAACG AAGTGTAACG 3240 

GAATCTAGTG TCTTTCTAAT GTGGTAAAAT TCTCCATCAA CATCACAGTC AGCTGGCAGC 3300 

TGAACTTCAG AATCTCACTT ACAGCAGGCG ACACGGGGGT ACACCGATGG GTCACACTGG 3360 

GTCTGGGGGC TCCCTGGAGC TCCTCCTGCG TGTGGTCTGG TTAGGAGTTG AGTTGTTTGC 3420 

TCCAGGGTTA TTCTCCTCCT CGAGTCACAG TCACACGAAT ACCTGCCTTC TCTGGCTTTC 3480 

CTGCTATACA CATATTCACA TGGCGCTCAA GAAGTTAGGC TCATGGCAAC GTGTGTCTTT 3540 

CTCTGGACAA CTGGCCCAGT TTACAGTGAA ATGGAGAATT TCAGGTCTCC ACGTCTGCCC 3600 

AGGAAAGAAC TTCAGCTGAC TCCACGGGGA TCTGGAAATC CACGACCAAT CCCGATCGGC 3660 

TCTTATTAGC TCCCCGCTCC ACAAGACACC TGTGCTTTGG AAATCCACCA CCAATCCCGA 3720 

TCGGCTCTTA TTAGCTCCCC GCTCCACAAG ACACCTGTGA TCTGGAAATC TACCACCAAT 3780 

CCCGATCGGC TCTTATTAGC TCCCCGCTCC ACAAGACACC TGTGACATCC TCCAGGGCCA 3840 

CAGGAGCACG TGCTGACCAG TTTTCCCTTC CAGTTCCTGC ACAAAAAGTG TCCAGAGGGC 3900 

TGTTTGCAAA CACTAGTGCA CTTTGTAGCT TTTCACCCTC TGTCCCAGGG AATCTAGGAG 3960 

AGATGAGGCC CGTCAGAGTC AAGAGATGTC ATCCCCCCAG GGTCTCCAAG GCATTTCCAC 4020 

ACTATTGGTG GCACCTGGAG GACATGCACC AAGGCTTGCC AGAGCCAACA GGAAGTGAGC 4080 

CCAGAGCATG GCACATGAGC ATCACCCGCT GATGGTGGCC TGCTGTGCCT GGTGCCAACA 4140 

GGGGCATCCC GGCCCGTACC CCTCCAGACA GGAAGCATGG GTTTGCCCAC AGACCTGTCG 4200 

GGTGCTCCTG TGAGTGGCCT CCAGATGTCT TTGTGCATAG GCACAAGTGG GCCAGGGCTG 4260 

GAGGGAGGTG GGAAACCTCA TCATCCGGTG GGCCCTGCCA ATCTTAACCC AGAACCCTTA 4320 

GGTATTCCTG GCAGTAGCCA TGACATTGGA GCACCTTCCT CTCCAGCCAG AGGCTGACCT 4380 

GAGGGCCACT GTCCTCAGAT GACACCACCC AGGAGCACCC TAGGTGAGGG GTGAGGGCCC 4440 

CCTTATGTGA ACCTCTTGCC TCTTCCTTTC TCCCATCAGA GTGGTTGGAT GGAGCCATTG 4500

GCCTCCTTTT CTTCAGCGGG CCCTTCAACC TCTCTGCACC ATGTTGTCTG GCTGAGGAGC 4560 

TACTAGAAAA GCTGAGTGGA GTCTCCTTTC CAACAGGATG ATGCATTTGC TCAATTCTCA 4620 

GGGCTGGAAT GAGCCGGCTG GTCCCCCAGA AAGCTGGAGT GGGGTACAGA GTTCAGTTTT 4680 

CCTCTCTGTT TACAGCTCCT TGACAGTCCC ACGCCCATCT GGAGTGGGAG CTGGGAGTTA 4740 

GTGTTGGAGA AGAAACAACA AAAGCCAATT AGAACCACTA TTTTTAAAAA GTGCTTACTG 4800 

TGCACAGATA CTCTTCAAGC ACTGGACGTG GATTCTCTCT CTAGCCCTCA GCACCCCTGC 4860 

GGTAGGAGTG CCGCCTCTAC CCACTTGTGA TGGGGTACAG AGGCACTTGC TCTTCTGCAT 4920 

GGTGTTCAAT AGGCTGGGAG TTTTATTTAT CTCTTCAAAC TTTGTACAAG AGCTCATGGC 4980 

TTGTCTTGGG CTTTCGTCAT TAAACCAAAG GAAATGGAAG CCATTCCCCT GTTGCTCTCC 5040 

TTAGTCTTGG TCATCAGAAC CTCACTTGGT ACCATATAGA TCAAAAGCTT TGTAACCACA 5100 

GGAAAAAATA AACTCTTCCA TCCCTTAAAG AATAGAATAG TTTGTCCCTC TCATGGGAAT 5160 

TGGGCTGTAT GTATATTGTT CTTCCTCCTT AGAATTTAGA GATACAAGAG TTCTACTTAG 5220 

AACTTTTCAT GGACACAATT TCCACAACCT TTCAGATGCT GATGTAGAGC TATTGGGAAA 5280 

GAACTTCCAA ACTCAGGAAG TTTGCAGAGA GCAGACAGCT AGAGATAACT CGGGACCCAG 5340 

AGTTGGTCGA CAGATGTTAG ATGTATCCTA GCTTTTAGCC ATAAACCACT CAAAGATTCA 5400 

GCCCCCAGAT CCCACAGTCA GAACTGAATC TGCGTTGTTG GGAAGCCAGC AGTGGCCTTG 5460 

GGAAGGAAGC CATGGCTGTG GTTCAGAGAG GGTGGGCTGG CAAGCCACTT CCGGGGAAAA 5520 

CTCCTTCCGC CCCAGGTTTC TTCTTCTCTT AAGGAGAGAT TGTTCTCACC AACCCGCTGC 5580 

CTTCATGCTG CCTTCAAAGC TAGATCATGT TTGCCTTGCT TAGAGAATTA CTGCAAATCA 5640

GCCCCAGTGC TTGGCGATGC ATTTACAGAT TTCTAGGCCC TCAGGGTTTT GTAGAGTGTG 5700 

AGCCCTGGTG GGCAGGGTTG GGGGGTCTGT CTTCTGCTGG ATGCTGCTTG TAATCCATTT 5760
GGTGTACAGA ATCAACAATA AATAATATAC ATGTAT

Seq ID NO: 611 Protein sequence
Protein Accession #: BAB84587.1

1 11 21 31 41 51

| | | | | |

MPLKHYLLLL VGCQAWGAGL AYHGCPSECT CSRASQVECT GARIVAVPTP LPWNAMSLQI 60 
LNTHITELNE SPFLNISALI ALRIEKNELS RITPGAFRNL GSLRYLSLAN NKLQVLPIGL 120 
FQGLDSLESL LLSSNQLLQI QPAHFSQCSN LKELQLHGNH LEYIPDGAFD HLVGLTKLNL 180 
GKNSLTHISP RVFQHLGNLQ VLRLYENRLT DIPMGTFDGL VNLQELALQQ NQIGLLSPGL 240 
FHNNHNLQRL YLSNNHISQL PPSIFMQLPQ LNRLTLFGNS LKELSLGIFG PMPNLRELWL 300 
YDNHISSLPD NVFSNLRQLQ VLILSRKQIS FISPGAFNGI TELRELSLHT NALQDLDGNV 360
FRMLANLQNI SLQNNRLRQL PGNIFANVNG LMAIQLQNNQ LENLPLGIFD HLGKLCELRL 420 
YDNPWRCDSD ILPLRNWLLL NQPRLGTDTV PVCFSPANVR GQSLIIINVN VAVPSVHVPE 480 
VPSYPETPWY PDTPSYPDTT SVSSTTELTS PVEDYTDLTT IQVTDDRSVW GMTQAQSGLA 540 
IAAIVIGIVA LACSLAACVG CCCCKKRSQA VLMQMKAPNE C 

Seq ID NO: 612 DNA sequence
Nucleic Acid Accession #: XM_098151
Coding sequence: 1..447

1 11 21 31 41 51

| | | | | |

ATGATGCATT TGCTCAATTC TCAGGGCTGG AATGAGCCGG CTGGTCCCCC AGAAAGCTGG 60
AGTGGGGTAC AGAGTTCAGT TTTCCTCTCT GTTTACAGCT CCTTGACAGT CCCACGCCCA 120
TCTGGAGTGG GAGCTGGGAG TCAGTGTTGG AGAAGAAACA ACAAAAGCCA ATTAGAACCA 180
CTATTTTTAA AAAGTGCTTA CTGTGCACAG ATACTCTTCA AGCACTGGAC GTGGATTCTC 240
TCTCTAGCCC TCAGCACCCC TGCGGTAGGA GTGCCGCCTC TACCCACTTG TGATGGGGTA 300
CAGAGGCACT TGCTCTTCTG CATGGTGTTC AATAGGCTGG GAGTTTTATT TATCTCTTCA 360
AACTTTGTAC AAGAGCTCAT GGCTTGTCTT GGGCTTTCGT CATTAAACCA AAGGAAATGG 420
AAGCCATTCC CCTGTTGCTC TCCTTAG

Seq ID NO: 613 Protein sequence
Protein Accession #: XP_098151

1 11 21 31 41 51

| | | | | |

MMHLLNSQGW NEPAGPPESW SGVQSSVFLS VYSSLTVPRP SGVGAGSQCW RRNNKSQLEP 60
LFLKSAYCAQ ILFKHWTWIL SLALSTPAVG VPPLPTCDGV QRHLLFCMVF NRLGVLFISS 120
NFVQELMACL GLSSLNQRKW KPFPCCSP

Seq ID NO: 614 DNA sequence

Nucleic Acid Accession #: NM_002658.1

Coding sequence: 77..1372

1 11 21 31 41 51

| | | | | |

GTCCCCGCAG CGCCGTCGCG CCCTCCTGCC GCAGGCCACC GAGGCCGCCG CCGTCTAGCG 60

CCCCGACCTC GCCACCATGA GAGCCCTGCT GGCGCGCCTG CTTCTCTGCG TCCTGGTCGT 120

GAGCGACTCC AAAGGCAGCA ATGAACTTCA TCAAGTTCCA TCGAACTGTG ACTGTCTAAA 180

TGGAGGAACA TGTGTGTCCA ACAAGTACTT CTCCAACATT CACTGGTGCA ACTGCCCAAA 240

GAAATTCGGA GGGCAGCACT GTGAAATAGA TAAGTCAAAA ACCTGCTATG AGGGGAATGG 300

TCACTTTTAC CGAGGAAAGG CCAGCACTGA CACCATGGGC CGGCCCTGCC TGCCCTGGAA 360

CTCTGCCACT GTCCTTCAGC AAACGTACCA TGCCCACAGA TCTGATGCTC TTCAGCTGGG 420

CCTGGGGAAA CATAATTACT GCAGGAACCC AGACAACCGG AGGCGACCCT GGTGCTATGT 480

GCAGGTGGGC CTAAAGCCGC TTGTCCAAGA GTGCATGGTG CATGACTGCG CAGATGGAAA 540

AAAGCCCTCC TCTCCTCCAG AAGAATTAAA ATTTCAGTGT GGCCAAAAGA CTCTGAGGCC 600

CCGCTTTAAG ATTATTGGGG GAGAATTCAC CACCATCGAG AACCAGCCCT GGTTTGCGGC 660

CATCTACAGG AGGCACCGGG GGGGCTCTGT CACCTACGTG TGTGGAGGCA GCCTCATCAG 720

CCCTTGCTGG GTGATCAGCG CCACACACTG CTTCATTGAT TACCCAAAGA AGGAGGACTA 780

CATCGTCTAC CTGGGTCGCT CAAGGCTTAA CTCCAACACG CAAGGGGAGA TGAAGTTTGA 840

GGTGGAAAAC CTCATCCTAC ACAAGGACTA CAGCGCTGAC ACGCTTGCTC ACCACAACGA 900

CATTGCCTTG CTGAAGATCC GTTCCAAGGA GGGCAGGTGT GCGCAGCCAT CCCGGACTAT 960

ACAGACCATC TGCCTGCCCT CGATGTATAA CGATCCCCAG TTTGGCACAA GCTGTGAGAT 1020

CACTGGCTTT GGAAAAGAGA ATTCTACCGA CTATCTCTAT CCGGAGCAGC TGAAAATGAC 1080

TGTTGTGAAG CTGATTTCCC ACCGGGAGTG TCAGCAGCCC CACTACTACG GCTCTGAAGT 1140

CACCACCAAA ATGCTATGTG CTGCTGACCC CCAATGGAAA ACAGATTCCT GCCAGGGAGA 1200

CTCAGGGGGA CCCCTCGTCT GTTCCCTCCA AGGCCGCATG ACTTTGACTG GAATTGTGAG 1260

CTGGGGCCGT GGATGTGCCC TGAAGGACAA GCCAGGCGTC TACACGAGAG TCTCACACTT 1320

CTTACCCTGG ATCCGCAGTC ACACCAAGGA AGAGAATGGC CTGGCCCTCT GAGGGTCCCC 1380

AGGGAGGAAA CGGGCACCAC CCGCTTTCTT GCTGGTTGTC ATTTTTGCAG TAGAGTCATC 1440

TCCATCAGCT GTAAGAAGAG ACTGGGAAGA TAGGCTCTGC ACAGATGGAT TTGCCTGTGG 1500

CACCACCAGG GTGAACGACA ATAGCTTTAC CCTCACGGAT AGGCCTGGGT GCTGGCTGCC 1560

CAGACCCTCT GGCCAGGATG GAGGGGTGGT CCTGACTCAA CATGTTACTG ACCAGCAACT 1620

TGTCTTTTTC TGGACTGAAG CCTGCAGGAG TTAAAAAGGG CAGGGCATCT CCTGTGCATG 1680

GGCTCGAAGG GAGAGCCAGC TCCCCCGACC GGTGGGCATT TGTGAGGCCC ATGGTTGAGA 1740

AATGAATAAT TTCCCAATTA GGAAGTGTAA GCAGCTGAGG TCTCTTGAGG GAGCTTAGCC 1800

AATGTGGGAG CAGCGGTTTG GGGAGCAGAG ACACTAACGA CTTCAGGGCA GGGCTCTGAT 1860

ATTCCATGAA TGTATCAGGA AATATATATG TGTGTGTATG TTTGCACACT TGTTGTGTGG 1920

GCTGTGAGTG TAAGTGTGAG TAAGAGCTGG TGTCTGATTG TTAAGTCTAA ATATTTCCTT 1980

AAACTGTGTG GACTGTGATG CCACACAGAG TGGTCTTTCT GGAGAGGTTA TAGGTCACTC 2040

CTGGGGCCTC TTGGGTCCCC CACGTGACAG TGCCTGGGAA TGTACTTATT CTGCAGCATG 2100

ACCTGTGACC AGCACTGTCT CAGTTTCACT TTCACATAGA TGTCCCTTTC TTGGCCAGTT 2160

ATCCCTTCCT TTTAGCCTAG TTCATCCAAT CCTCACTGGG TGGGGTGAGG ACCACTCCTT 2220

ACACTGAATA TTTATATTTC ACTATTTTTA TTTATATTTT TGTAATTTTA AATAAAAGTG 2280
ATCAATAAAA TGTGATTTTT CTGA

Seq ID NO: 615 Protein sequence
Protein Accession #: NP_002649.1

1 11 21 31 41 51

| | | | | |

MRALLARLLL CVLVVSDSKG SNELHQVPSN CDCLNGGTCV SNKYFSNIHW CNCPKKFGGQ 60

HCEIDKSKTC YEGNGHFYRG KASTDTMGRP CLPWNSATVL QQTYHAHRSD ALQLGLGKHN 120 

YCRNPDNRRR PWCYVQVGLK PLVQECMVHD CADGKKPSSP PEELKFQCGQ KTLRPRFKII 180 

GGEFTTIENQ PWFAAIYRRH RGGSVTYVCG GSLISPCWVI SATHCFIDYP KKEDYIVYLG 240 

RSRLNSNTQG EMKFEVENLI LHKDYSADTL AHHNDIALLK IRSKEGRCAQ PSRTIQTICL 300

PSMYNDPQFG TSCEITGFGK ENSTDYLYPE QLKMTVVKLI SHRECQQPHY YGSEVTTKML 360

CAADPQWKTD SCQGDSGGPL VCSLQGRMTL TGIVSWGRGC ALKDKPGVYT RVSHFLPWIR 420 
SHTKEENGLA L 

Seq ID NO: 616 DNA sequence 

Nucleic Acid Accession #: NM_024422.1

Coding sequence: 202..2907

1 11 21 31 41 51

| | | | | |

CGCCAAAGGA AAAGCCCCTT GGATGAGAGG CAGGCGCTTC AGAGAAGCTA AGAAAAGCAC 60 

CTCTCCGCGC GCCCCACCTC CTCCGCCTCG CGCTCCTCCT GAGCAGCGGG CCCAGACTGC 120 

GCTCCGGCCG CGGCCCTCGC CCCGCGGAGC CCTCCTACCC CGGCCCGACG CTCGGCCCGC 180 

GACCTGCCCC GAGCCCTCTC CATGGAGGCA GCCCGCCCCT CCGGCTCCTG GAACGGAGCC 240 

CTCTGCCGGC TGCTCCTGCT GACCCTCGCG ATCTTAATAT TTGCCAGTGA TGCCTGCAAA 300 

AATGTGACAT TACATGTTCC CTCCAAACTA GATGCCGAGA AACTTGTTGG TAGAGTTAAC 360

CTGAAAGAGT GCTTTACAGC TGCAAATCTA ATTCATTCAA GTGATCCTGA CTTCCAAATT 420 

TTGGAGGATG GTTCAGTCTA TACAACAAAT ACTATTCTAT TGTCCTCGGA GAAGAGAAGT 480 

TTTACCATAT TACTTTCCAA CACTGAGAAC CAAGAAAAGA AGAAAATATT TGTCTTTTTG 540 

GAGCATCAAA CAAAGGTCCT AAAGAAAAGA CATACTAAAG AAAAAGTTCT AAGGCGCGCC 600 

AAGAGAAGAT GGGCTCCAAT TCCTTGTTCG ATGCTAGAAA ACTCCTTGGG TCCTTTTCCA 660 

CTTTTCCTTC AACAGGTTCA ATCTGACACG GCCCAAAACT ATACCATATA CTATTCCATA 720 

AGAGGTCCTG GAGTTGACCA AGAACCTCGG AATTTATTTT ATGTGGAGAG AGACACTGGA 780 

AACTTGTATT GTACTCGTCC TGTAGATCGT GAGCAGTATG AATCTTTTGA GATAATTGCC 840 

TTTGCAACAA CTCCAGATGG GTATACTCCA GAACTTCCAC TGCCCCTAAT AATCAAAATA 900 

GAGGATGAAA ATGATAACTA CCCAATTTTT ACAGAAGAAA CTTATACTTT TACAATTTTT 960 

GAAAATTGCA GAGTGGGCAC TACTGTGGGA CAAGTGTGTG CTACTGACAA AGATGAGCCT 1020 

GACACGATGC ACACACGCCT GAAGTACTCC ATCATTGGGC AGGTGCCACC ATCACCCACC 1080 

CTATTTTCTA TGCATCCAAC TACAGGCGTG ATCACCACAA CATCATCTCA GCTAGACAGA 1140 

GAGTTAATTG ACAAGTACCA GTTGAAAATA AAAGTACAAG ACATGGATGG TCAGTATTTT 1200 

GGTCTACAGA CAACTTCAAC TTGTATCATT AACATTGATG ATGTAAATGA CCACTTGCCA 1260 

ACATTTACTC GTACTTCTTA TGTGACATCA GTGGAAGAAA ATACAGTTGA TGTGGAAATC 1320 

TTACGAGTTA CTGTTGAGGA TAAGGACTTA GTGAATACTG CTAACTGGAG AGCTAATTAT 1380 

ACCATTTTAA AGGGCAATGA AAATGGCAAT TTTAAAATTG TAACAGATGC CAAAACCAAT 1440 

GAAGGAGTTC TTTGTGTAGT TAAGCCTTTG AATTATGAAG AAAAGCAACA GATGATCTTG 1500 

CAAATTGGTG TAGTTAATGA AGCTCCATTT TCCAGAGAGG CTAGTCCAAG ATCAGCCATG 1560 

AGCACAGCAA CAGTTACTGT TAATGTAGAA GATCAGGATG AGGGCCCTGA GTGTAACCCT 1620 

CCAATACAGA CTGTTCGCAT GAAAGAAAAT GCAGAAGTGG GAACAACAAG CAATGGATAT 1680 

AAAGCATATG ACCCAGAAAC AAGAAGTAGC AGTGGCATAA GGTATAAGAA ATTAACTGAT 1740 

CCAACAGGGT GGGTCACCAT TGATGAAAAT ACAGGATCAA TCAAAGTTTT CAGAAGCCTG 1800 

GATAGAGAGG CAGAGACCAT CAAAAATGGC ATATATAATA TTACAGTCCT TGCATCAGAC 1860 

CAAGGAGGGA GAACATGTAC GGGGACACTG GGCATTATAC TTCAAGACGT GAATGATAAC 1920 

AGCCCATTCA TACCTAAAAA GACAGTGATC ATCTGCAAAC CCACCATGTC ATCTGCGGAG 1980 

ATTGTTGCGG TTGATCCTGA TGAGCCTATC CATGGCCCAC CCTTTGACTT TAGTCTGGAG 2040 

AGTTCTACTT CAGAAGTACA GAGAATGTGG AGACTGAAAG CAATTAATGA TACAGCAGCA 2100 

CGTCTTTCCT ATCAGAATGA TCCTCCATTT GGCTCATATG TAGTACCTAT AACAGTGAGA 2160 

GATAGACTTG GCATGTCTAG TGTCACTTCA TTGGATGTTA CACTGTGTGA CTGCATTACC 2220 

GAAAATGACT GCACACATCG TGTAGATCCA AGGATTGGCG GTGGAGGAGT ACAACTTGGA 2280 

AAGTGGGCCA TCCTTGCAAT ATTGTTGGGC ATAGCATTGC TCTTTTGCAT CCTGTTTACG 2340 

CTGGTCTGTG GGGCTTCTGG GACGTCTAAA CAACCAAAAG TAATTCCTGA TGATTTAGCC 2400 

CAGCAGAACC TAATTGTATC AAACACAGAA GCTCCTGGAG ATGACAAAGT GTATTCTGCG 2460 

AATGGCTTCA CAACCCAAAC TGTGGGCGCT TCTGCTCAGG GAGTTTGTGG CACCGTGGGA 2520 

TCAGGAATCA AAAACGGAGG TCAGGAGACC ATCGAAATGG TGAAAGGAGG ACACCAGACC 2580 

TCGGAATCCT GCCGGGGGGC TGGCCACCAT CACACCCTGG ACTCCTGCAG GGGAGGACAC 2640 

ACGGAGGTGG ACAACTGCAG ATACACTTAC TCGGAGTGGC ACAGTTTTAC TCAGCCCCGT 2700

CTTGGTGAAA AAGTGTATCT GTGTAATCAA GATGAAAATC ACAAGCATGC CCAAGACTAT 2760 

GTCCTGACAT ATAACTATGA AGGAAGAGGA TCGGTGGCTG GGTCTGTAGG TTGTTGCAGT 2820 

GAACGACAAG AAGAAGATGG GCTTGAATTT TTGGATAATT TGGAGCCCAA ATTTAGGACA 2880 

CTAGCAGAAG CATGCATGAA GAGATGAGTG TGTTCTAATA AGTCTCTGAA AGCCAGTGGC 2940

TTTATGACTT TTAAAAAAAA TTACAAACCA AGAATTTTTT AAAGCAGAAG ATGCTATTTG 3000 

TGGGGGTTTT TCTCTCATTA TTTGGATGGA ATCTCTTTGG TCAAATGCAC ATTTACAGAG 3060 

AGACACTATA AACAAGTACA CAAATTTTTC AATTTTTACA TATTTTTAAA TTACTTATCT 3120 

TCTATCCAAG GAGGTCTACA GAGAAATTAA AGTCTGCCTT ATTTGTTACA TTTGGGTATA 3180 

ATGACAACAG CCAATTTATA GTGCAATAAA ATGTAATTAA TTCAAGTCCT TATTATAGAC 3240 

TATTTGAAGC ACAACCTAAT GGAAAATTGT AGAGACCTTG CTTTAACATT ATCTCCAGTT 3300 

AATTAAGTGT TCATGTGGTG CTTGGAAACT GTTGTTTTCC TGAACATCTA AAGTGTGTAG 3360 

ACTGCATTCT TGCTATTATT TTATTCTTGT AATGTGACCT TTTCACTGTG CAAAGGGAGA 3420 
TTTCTAGCCA GGCATTGACT ATTACAATTT CATT 

Seq ID NO: 617 Protein sequence 
Protein Accession #: NP_077740.1

1 11 21 31 41 51

| | | | | |

MEAARPSGSW NGALCRLLLL TLAILIFASD ACKNVTLHVP SKLDAEKLVG RVNLKECFTA 60 

ANLIHSSDPD FQILEDGSVY TTNTILLSSE KRSFTILLSN TENQEKKKIF VFLEHQTKVL 120

KKRHTKEKVL RRAKRRWAPI PCSMLENSLG PFPLFLQQVQ SDTAQNYTIY YSIRGPGVDQ 180

EPRNLFYVER DTGNLYCTRP VDREQYESFE IIAFATTPDG YTPELPLPLI IKIEDENDNY 240

PIFTEETYTF TIFENCRVGT TVGQVCATDK DEPDTMHTRL KYSIIGQVPP SPTLFSMHPT 300 

TGVITTTSSQ LDRELIDKYQ LKIKVQDMDG QYFGLQTTST CIINIDDVND HLPTFTRTSY 360

VTSVEENTVD VEILRVTVED KDLVNTANWR ANYTILKGNE NGNFKIVTDA KTNEGVLCVV 420

KPLNYEEKQQ MILQIGVVNE APFSREASPR SAMSTATVTV NVEDQDEGPE CNPPIQTVRM 480

KENAEVGTTS NGYKAYDPET RSSSGIRYKK LTDPTGWVTI DENTGSIKVF RSLDREAETI 540

KNGIYNITVL ASDQGGRTCT GTLGIILQDV NDNSPFIPKK TVIICKPTMS SAEIVAVDPD 600

EPIHGPPFDF SLESSTSEVQ RMWRLKAIND TAARLSYQND PPFGSYVVPI TVRDRLGMSS 660

VTSLDVTLCD CITENDCTHR VDPRIGGGGV QLGKWAILAI LLGIALLFCI LFTLVCGASG 720

TSKQPKVIPD DLAQQNLIVS NTEAPGDDKV YSANGFTTQT VGASAQGVCG TVGSGIKNGG 780 

QETIEMVKGG HQTSESCRGA GHHHTLDSCR GGHTEVDNCR YTYSEWHSFT QPRLGEKVYL 840

CNQDENHKHA QDYVLTYNYE GRGSVAGSVG CCSERQEEDG LEFLDNLEPK FRTLAEACMK 900 
R 

Seq ID NO: 618 DNA sequence 

Nucleic Acid Accession #: NM_004949.1

Coding sequence: 202..2745

1 11 21 31 41 51

| | | | | |

CGCCAAAGGA AAAGCCCCTT GGATGAGAGG CAGGCGCTTC AGAGAAGCTA AGAAAAGCAC 60 

CTCTCCGCGC GCCCCACCTC CTCCGCCTCG CGCTCCTCCT GAGCAGCGGG CCCAGACTGC 120 

GCTCCGGCCG CGGCCCTCGC CCCGCGGAGC CCTCCTACCC CGGCCCGACG CTCGGCCCGC 180 

GACCTGCCCC GAGCCCTCTC CATGGAGGCA GCCCGCCCCT CCGGCTCCTG GAACGGAGCC 240

CTCTGCCGGC TGCTCCTGCT GACCCTCGCG ATCTTAATAT TTGCCAGTGA TGCCTGCAAA 300 

AATGTGACAT TACATGTTCC CTCCAAACTA GATGCCGAGA AACTTGTTGG TAGAGTTAAC 360

CTGAAAGAGT GCTTTACAGC TGCAAATCTA ATTCATTCAA GTGATCCTGA CTTCCAAATT 420 

TTGGAGGATG GTTCAGTCTA TACAACAAAT ACTATTCTAT TGTCCTCGGA GAAGAGAAGT 480 

TTTACCATAT TACTTTCCAA CACTGAGAAC CAAGAAAAGA AGAAAATATT TGTCTTTTTG 540 

GAGCATCAAA CAAAGGTCCT AAAGAAAAGA CATACTAAAG AAAAAGTTCT AAGGCGCGCC 600 

AAGAGAAGAT GGGCTCCAAT TCCTTGTTCG ATGCTAGAAA ACTCCTTGGG TCCTTTTCCA 660 

CTTTTCCTTC AACAGGTTCA ATCTGACACG GCCCAAAACT ATACCATATA CTATTCCATA 720 

AGAGGTCCTG GAGTTGACCA AGAACCTCGG AATTTATTTT ATGTGGAGAG AGACACTGGA 780 

AACTTGTATT GTACTCGTCC TGTAGATCGT GAGCAGTATG AATCTTTTGA GATAATTGCC 840 

TTTGCAACAA CTCCAGATGG GTATACTCCA GAACTTCCAC TGCCCCTAAT AATCAAAATA 900 

GAGGATGAAA ATGATAACTA CCCAATTTTT ACAGAAGAAA CTTATACTTT TACAATTTTT 960 

GAAAATTGCA GAGTGGGCAC TACTGTGGGA CAAGTGTGTG CTACTGACAA AGATGAGCCT 1020 

GACACGATGC ACACACGCCT GAAGTACTCC ATCATTGGGC AGGTGCCACC ATCACCCACC 1080 

CTATTTTCTA TGCATCCAAC TACAGGCGTG ATCACCACAA CATCATCTCA GCTAGACAGA 1140 

GAGTTAATTG ACAAGTACCA GTTGAAAATA AAAGTACAAG ACATGGATGG TCAGTATTTT 1200 

GGTCTACAGA CAACTTCAAC TTGTATCATT AACATTGATG ATGTAAATGA CCACTTGCCA 1260 

ACATTTACTC GTACTTCTTA TGTGACATCA GTGGAAGAAA ATACAGTTGA TGTGGAAATC 1320 

TTACGAGTTA CTGTTGAGGA TAAGGACTTA GTGAATACTG CTAACTGGAG AGCTAATTAT 1380 

ACCATTTTAA AGGGCAATGA AAATGGCAAT TTTAAAATTG TAACAGATGC CAAAACCAAT 1440 

GAAGGAGTTC TTTGTGTAGT TAAGCCTTTG AATTATGAAG AAAAGCAACA GATGATCTTG 1500 

CAAATTGGTG TAGTTAATGA AGCTCCATTT TCCAGAGAGG CTAGTCCAAG ATCAGCCATG 1560 

AGCACAGCAA CAGTTACTGT TAATGTAGAA GATCAGGATG AGGGCCCTGA GTGTAACCCT 1620 

CCAATACAGA CTGTTCGCAT GAAAGAAAAT GCAGAAGTGG GAACAACAAG CAATGGATAT 1680 

AAAGCATATG ACCCAGAAAC AAGAAGTAGC AGTGGCATAA GGTATAAGAA ATTAACTGAT 1740 

CCAACAGGGT GGGTCACCAT TGATGAAAAT ACAGGATCAA TCAAAGTTTT CAGAAGCCTG 1800 

GATAGAGAGG CAGAGACCAT CAAAAATGGC ATATATAATA TTACAGTCCT TGCATCAGAC 1860

CAAGGAGGGA GAACATGTAC GGGGACACTG GGCATTATAC TTCAAGACGT GAATGATAAC 1920 

AGCCCATTCA TACCTAAAAA GACAGTGATC ATCTGCAAAC CCACCATGTC ATCTGCGGAG 1980 

ATTGTTGCGG TTGATCCTGA TGAGCCTATC CATGGCCCAC CCTTTGACTT TAGTCTGGAG 2040 

AGTTCTACTT CAGAAGTACA GAGAATGTGG AGACTGAAAG CAATTAATGA TACAGCAGCA 2100 

CGTCTTTCCT ATCAGAATGA TCCTCCATTT GGCTCATATG TAGTACCTAT AACAGTGAGA 2160 

GATAGACTTG GCATGTCTAG TGTCACTTCA TTGGATGTTA CACTGTGTGA CTGCATTACC 2220 

GAAAATGACT GCACACATCG TGTAGATCCA AGGATTGGCG GTGGAGGAGT ACAACTTGGA 2280 

AAGTGGGCCA TCCTTGCAAT ATTGTTGGGC ATAGCATTGC TCTTTTGCAT CCTGTTTACG 2340 

CTGGTCTGTG GGGCTTCTGG GACGTCTAAA CAACCAAAAG TAATTCCTGA TGATTTAGCC 2400 

CAGCAGAACC TAATTGTATC AAACACAGAA GCTCCTGGAG ATGACAAAGT GTATTCTGCG 2460 

AATGGCTTCA CAACCCAAAC TGTGGGCGCT TCTGCTCAGG GAGTTTGTGG CACCGTGGGA 2520 

TCAGGAATCA AAAACGGAGG TCAGGAGACC ATCGAAATGG TGAAAGGAGG ACACCAGACC 2580 

TCGGAATCCT GCCGGGGGGC TGGCCACCAT CACACCCTGG ACTCCTGCAG GGGAGGACAC 2640 

ACGGAGGTGG ACAACTGCAG ATACACTTAC TCGGAGTGGC ACAGTTTTAC TCAGCCCCGT 2700 

CTTGGTGAAG AATCCATTAG AGGACACACT CTGATTAAAA ATTAAACAAT GAAAGAAAGT 2760 

GTATCTGTGT AATCAAGATG AAAATCACAA GCATGCCCAA GACTATGTCC TGACATATAA 2820 

CTATGAAGGA AGAGGATCGG TGGCTGGGTC TGTAGGTTGT TGCAGTGAAC GACAAGAAGA 2880 

AGATGGGCTT GAATTTTTGG ATAATTTGGA GCCCAAATTT AGGACACTAG CAGAAGCATG 2940 

CATGAAGAGA TGAGTGTGTT CTAATAAGTC TCTGAAAGCC AGTGGCTTTA TGACTTTTAA 3000 

AAAAAATTAC AAACCAAGAA TTTTTTAAAG CAGAAGATGC TATTTGTGGG GGTTTTTCTC 3060 

TCATTATTTG GATGGAATCT CTTTGGTCAA ATGCACATTT ACAGAGAGAC ACTATAAACA 3120 

AGTACACAAA TTTTTCAATT TTTACATATT TTTAAATTAC TTATCTTCTA TCCAAGGAGG 3180 

TCTACAGAGA AATTAAAGTC TGCCTTATTT GTTACATTTG GGTATAATGA CAACAGCCAA 3240 

TTTATAGTGC AATAAAATGT AATTAATTCA AGTCCTTATT ATAGACTATT TGAAGCACAA 3300 

CCTAATGGAA AATTGTAGAG ACCTTGCTTT AACATTATCT CCAGTTAATT AAGTGTTCAT 3360 

GTGGTGCTTG GAAACTGTTG TTTTCCTGAA CATCTAAAGT GTGTAGACTG CATTCTTGCT 3420 

ATTATTTTAT TCTTGTAATG TGACCTTTTC ACTGTGCAAA GGGAGATTTC TAGCCAGGCA 3480 
TTGACTATTA CAATTTCATT 

Seq ID NO: 619 Protein sequence 
Protein Accession #: NP_004940.1

1 11 21 31 41 51

| | | | | |

MEAARPSGSW NGALCRLLLL TLAILIFASD ACKNVTLHVP SKLDAEKLVG RVNLKECFTA 60

GCGTCAAAAG GAACGATGCA GGATCCTATG AATGTGAAAT ACAGAACCCA GCGAGTGCCA 720

ACCGCAGTGA CCCAGTCACC CTGAATGTCC TCTATGGCCC AGATGTCCCC ACCATTTCCC 780

CCTCAAAGGC CAATTACCGT CCAGGGGAAA ATCTGAACCT CTCCTGCCAC GCAGCCTCTA 840 

ACCCACCTGC ACAGTACTCT TGGTTTATCA ATGGGACGTT CCAGCAATCC ACACAAGAGC 900

TCTTTATCCC CAACATCACT GTGAATAATA GCGGATCCTA TATGTGCCAA GCCCATAACT 960 

CAGCCACTGG CCTCAATAGG ACCACAGTCA CGATGATCAC AGTCTCTGGA AGTGCTCCTG 1020 

TCCTCTCAGC TGTGGCCACC GTCGGCATCA CGATTGGAGT GCTGGCCAGG GTGGCTCTGA 1080

TATAGCAGCC CTGGTGTATT TTCGATATTT CAGGAAGACT GGCAGATTGG ACCAGACCCT 1140

GAATTCTTCT AGCTCCTCCA ATCCCATTTT ATCCCATGGA ACCACTAAAA ACAAGGTCTG 1200 

CTCTGCTCCT GAAGCCCTAT ATGCTGGAGA TGGACAACTC AATGAAAATT TAAAGGGAAA 1260 

ACCCTCAGGC CTGAGGTGTG TGCCACTCAG AGACTTCACC TAACTAGAGA CAGTCAAACT 1320 

GCAAACCATG GTGAGAAATT GACGACTTCA CACTATGGAC AGCTTTTCCC AAGATGTCAA 1380 

AACAAGACTC CTCATCATGA TAAGGCTCTT ACCCCCTTTT AATTTGTCCT TGCTTATGCC 1440 

TGCCTCTTTC GCTTGGCAGG ATGATGCTGT CATTAGTATT TCACAAGAAG TAGCTTCAGA 1500 

GGGTAACTTA ACAGAGTGTC AGATCTATCT TGTCAATCCC AACGTTTTAC ATAAAATAAG 1560 

AGATCCTTTA GTGCACCCAG TGACTGACAT TAGCAGCATC TTTAACACAG CCGTGTGTTC 1620 

AAATGTACAG TGGTCCTTTT CAGAGTTGGA CTTCTAGACT CACCTGTTCT CACTCCCTGT 1680 

TTTAATTCAA CCCAGCCATG CAATGCCAAA TAATAGAATT GCTCCCTACC AGCTGAACAG 1740 

GGAGGAGTCT GTGCAGTTTC TGACACTTGT TGTTGAACAT GGCTAAATAC AATGGGTATC 1800 

GCTGAGACTA AGTTGTAGAA ATTAACAAAT GTGCTGCTTG GTTAAAATGG CTACACTCAT 1860 

CTGACTCATT CTTTATTCTA TTTTAGTTGG TTTGTATCTT GCCTAAGGTG CGTAGTCCAA 1920 

CTCTTGGTAT TACCCTCCTA ATAGTCATAC TAGTAGTCAT ACTCCCTGGT GTAGTGTATT 1980 

CTCTAAAAGC TTTAAATGTC TGCATGCAGC CAGCCATCAA ATAGTGAATG GTCTCTCTTT 2040 

GGCTGGAATT ACAAAACTCA GAGAAATGTG TCATCAGGAG AACATCATAA CCCATGAAGG 2100 

ATAAAAGCCC CAAATGGTGG TAACTGATAA TAGCACTAAT GCTTTAAGAT TTGGTCACAC 2160 

TCTCACCTAG GTGAGCGCAT TGAGCCAGTG GTGCTAAATG CTACATACTC CAACTGAAAT 2220 

GTTAAGGAAG AAGATAGATC CAATTAAAAA AAATTAAAAC CAATTTAAAA AAAAAAAAGA 2280 

ACACAGGAGA TTCCAGTCTA CTTGAGTTAG CATAATACAG AAGTCCCCTC TACTTTAACT 2340 

TTTACAAAAA AGTAACCTGA ACTAATCTGA TGTTAACCAA TGTATTTATT TCTGTGGTTC 2400 

TGTTTCCTTG TTCCAATTTG ACAAAACCCA CTGTTCTTGT ATTGTATTGC CCAGGGGGAG 2460 

CTATCACTGT ACTTGTAGAG TGGTGCTGCT TTAATTCATA AATCACAAAT AAAAGCCAAT 2520 
TAGCTCTATA ACT 

Seq ID NO: 625 Protein sequence
Protein Accession #: AAA59907.1

1 11 21 31 41 51

| | | | | |

MGPPSAPPCR LHVPWKEVLL TASLLTFWNP PTTAKLTIES TPFNVAEGKE VLLLAHNLPQ 60

NRIGYSWYKG ERVDGNSLIV GYVIGTQQAT PGPAYSGRET IYPNASLLIQ NVTQNDTGFY 120

TLQVIKSDLV NEEATGQFHV YPELPKPSIS SNNSNPVEDK DAVAFTCEPE VQNTTYLWWV 180 

NGQSLPVSPR LQLSNGNMTL TLLSVKRNDA GSYECEIQNP ASANRSDPVT LNVLYGPDVP 240 

TISPSKANYR PGENLNLSCH AASNPPAQYS WFINGTFQQS TQELFIPNIT VNNSGSYMCQ 300
AHNSATGLNR TTVTMITVSG SAPVLSAVAT VGITIGVLAR VALI 

Seq ID NO: 626 DNA sequence
Nucleic Acid Accession #: M18728.1
Coding sequence: 1355..1657

1 11 21 31 41 51

| | | | | |

GGAGCTCAAG CTCCTCTACA AAGAGGTGGA CAGAGAAGAC AGCAGAGACC ATGGGACCCC 60 

CCTCAGCCCC TCCCTGCAGA TTGCATGTCC CCTGGAAGGA GGTCCTGCTC ACAGCCTCAC 120 

TTCTAACCTT CTGGAACCCA CCCACCACTG CCAAGCTCAC TATTGAATCC ACGCCATTCA 180 

ATGTCGCAGA GGGGAAGGAG GTTCTTCTAC TCGCCCACAA CCTGCCCCAG AATCGTATTG 240 

GTTACAGCTG GTACAAAGGC GAAAGAGTGG ATGGCAACAG TCTAATTGTA GGATATGTAA 300 

TAGGAACTCA ACAAGCTACC CCAGGGCCCG CATACAGTGG TCGAGAGACA ATATACCCCA 360

ATGCATCCCT GCTGATCCAG AACGTCACCC AGAATGACAC AGGATTCTAT ACCCTACAAG 420 

TCATAAAGTC AGATCTTGTG AATGAAGAAG CAACCGGACA GTTCCATGTA TACCCGGAGC 480 

TGCCCAAGCC CTCCATCTCC AGCAACAACT CCAACCCCGT GGAGGACAAG GATGCTGTGG 540 

CCTTCACCTG TGAACCTGAG GTTCAGAACA CAACCTACCT GTGGTGGGTA AATGGTCAGA 600 

GCCTCCCGGT CAGTCCCAGG CTGCAGCTGT CCAATGGCAA CATGACCCTC ACTCTACTCA 660 

GCGTCAAAAG GAACGATGCA GGATCCTATG AATGTGAAAT ACAGAACCCA GCGAGTGCCA 720 

ACCGCAGTGA CCCAGTCACC CTGAATGTCC TCTATGGCCC AGATGTCCCC ACCATTTCCC 780 

CCTCAAAGGC CAATTACCGT CCAGGGGAAA ATCTGAACCT CTCCTGCCAC GCAGCCTCTA 840 

ACCCACCTGC ACAGTACTCT TGGTTTATCA ATGGGACGTT CCAGCAATCC ACACAAGAGC 900 

TCTTTATCCC CAACATCACT GTGAATAATA GCGGATCCTA TATGTGCCAA GCCCATAACT 960 

CAGCCACTGG CCTCAATAGG ACCACAGTCA CGATGATCAC AGTCTCTGGA AGTGCTCCTG 1020 

TCCTCTCAGC TGTGGCCACC GTCGGCATCA CGATTGGAGT GCTGGCCAGG GTGGCTCTGA 1080 

TATAGCAGCC CTGGTGTATT TTCGATATTT CAGGAAGACT GGCAGATTGG ACCAGACCCT 1140 

GAATTCTTCT AGCTCCTCCA ATCCCATTTT ATCCCATGGA ACCACTAAAA ACAAGGTCTG 1200 

CTCTGCTCCT GAAGCCCTAT ATGCTGGAGA TGGACAACTC AATGAAAATT TAAAGGGAAA 1260 

ACCCTCAGGC CTGAGGTGTG TGCCACTCAG AGACTTCACC TAACTAGAGA CAGTCAAACT 1320 

GCAAACCATG GTGAGAAATT GACGACTTCA CACTATGGAC AGCTTTTCCC AAGATGTCAA 1380 

AACAAGACTC CTCATCATGA TAAGGCTCTT ACCCCCTTTT AATTTGTCCT TGCTTATGCC 1440 

TGCCTCTTTC GCTTGGCAGG ATGATGCTGT CATTAGTATT TCACAAGAAG TAGCTTCAGA 1500 

GGGTAACTTA ACAGAGTGTC AGATCTATCT TGTCAATCCC AACGTTTTAC ATAAAATAAG 1560 

AGATCCTTTA GTGCACCCAG TGACTGACAT TAGCAGCATC TTTAACACAG CCGTGTGTTC 1620 

AAATGTACAG TGGTCCTTTT CAGAGTTGGA CTTCTAGACT CACCTGTTCT CACTCCCTGT 1680 

TTTAATTCAA CCCAGCCATG CAATGCCAAA TAATAGAATT GCTCCCTACC AGCTGAACAG 1740 

GGAGGAGTCT GTGCAGTTTC TGACACTTGT TGTTGAACAT GGCTAAATAC AATGGGTATC 1800 

GCTGAGACTA AGTTGTAGAA ATTAACAAAT GTGCTGCTTG GTTAAAATGG CTACACTCAT 1860 

CTGACTCATT CTTTATTCTA TTTTAGTTGG TTTGTATCTT GCCTAAGGTG CGTAGTCCAA 1920 

CTCTTGGTAT TACCCTCCTA ATAGTCATAC TAGTAGTCAT ACTCCCTGGT GTAGTGTATT 1980 

CTCTAAAAGC TTTAAATGTC TGCATGCAGC CAGCCATCAA ATAGTGAATG GTCTCTCTTT 2040 

GGCTGGAATT ACAAAACTCA GAGAAATGTG TCATCAGGAG AACATCATAA CCCATGAAGG 2100 

ATAAAAGCCC CAAATGGTGG TAACTGATAA TAGCACTAAT GCTTTAAGAT TTGGTCACAC 2160

TCTCACCTAG GTGAGCGCAT TGAGCCAGTG GTGCTAAATG CTACATACTC CAACTGAAAT 2220

GTTAAGGAAG AAGATAGATC CAATTAAAAA AAATTAAAAC CAATTTAAAA AAAAAAAAGA 2280 

ACACAGGAGA TTCCAGTCTA CTTGAGTTAG CATAATACAG AAGTCCCCTC TACTTTAACT 2340 

TTTACAAAAA AGTAACCTGA ACTAATCTGA TGTTAACCAA TGTATTTATT TCTGTGGTTC 2400 

TGTTTCCTTG TTCCAATTTG ACAAAACCCA CTGTTCTTGT ATTGTATTGC CCAGGGGGAG 2460 

CTATCACTGT ACTTGTAGAG TGGTGCTGCT TTAATTCATA AATCACAAAT AAAAGCCAAT 2520 
TAGCTCTATA ACT 

Seq ID NO: 627 Protein sequence
Protein Accession #: AAA59908.1

1 11 21 31 41 51

| | | | | |

MDSFSQDVKT RLLIMIRLLP PFNLSLLMPA SFAWQDDAVI SISQEVASEG NLTECQIYLV 60

NPNVLHKIRD PLVHPVTDIS SIFNTAVCSN VQWSFSELDF

Seq ID NO: 628 DNA sequence
Nucleic Acid Accession #: M18728.1
Coding sequence: 2370..2501

1 11 21 31 41 51

| | | | | |

GGAGCTCAAG CTCCTCTACA AAGAGGTGGA CAGAGAAGAC AGCAGAGACC ATGGGACCCC 60 

CCTCAGCCCC TCCCTGCAGA TTGCATGTCC CCTGGAAGGA GGTCCTGCTC ACAGCCTCAC 120 

TTCTAACCTT CTGGAACCCA CCCACCACTG CCAAGCTCAC TATTGAATCC ACGCCATTCA 180 

ATGTCGCAGA GGGGAAGGAG GTTCTTCTAC TCGCCCACAA CCTGCCCCAG AATCGTATTG 240 

GTTACAGCTG GTACAAAGGC GAAAGAGTGG ATGGCAACAG TCTAATTGTA GGATATGTAA 300 

TAGGAACTCA ACAAGCTACC CCAGGGCCCG CATACAGTGG TCGAGAGACA ATATACCCCA 360 

ATGCATCCCT GCTGATCCAG AACGTCACCC AGAATGACAC AGGATTCTAT ACCCTACAAG 420 

TCATAAAGTC AGATCTTGTG AATGAAGAAG CAACCGGACA GTTCCATGTA TACCCGGAGC 480 

TGCCCAAGCC CTCCATCTCC AGCAACAACT CCAACCCCGT GGAGGACAAG GATGCTGTGG 540 

CCTTCACCTG TGAACCTGAG GTTCAGAACA CAACCTACCT GTGGTGGGTA AATGGTCAGA 600 

GCCTCCCGGT CAGTCCCAGG CTGCAGCTGT CCAATGGCAA CATGACCCTC ACTCTACTCA 660 

GCGTCAAAAG GAACGATGCA GGATCCTATG AATGTGAAAT ACAGAACCCA GCGAGTGCCA 720 

ACCGCAGTGA CCCAGTCACC CTGAATGTCC TCTATGGCCC AGATGTCCCC ACCATTTCCC 780 

CCTCAAAGGC CAATTACCGT CCAGGGGAAA ATCTGAACCT CTCCTGCCAC GCAGCCTCTA 840

ACCCACCTGC ACAGTACTCT TGGTTTATCA ATGGGACGTT CCAGCAATCC ACACAAGAGC 900 

TCTTTATCCC CAACATCACT GTGAATAATA GCGGATCCTA TATGTGCCAA GCCCATAACT 960 

CAGCCACTGG CCTCAATAGG ACCACAGTCA CGATGATCAC AGTCTCTGGA AGTGCTCCTG 1020

TCCTCTCAGC TGTGGCCACC GTCGGCATCA CGATTGGAGT GCTGGCCAGG GTGGCTCTGA 1080 

TATAGCAGCC CTGGTGTATT TTCGATATTT CAGGAAGACT GGCAGATTGG ACCAGACCCT 1140 

GAATTCTTCT AGCTCCTCCA ATCCCATTTT ATCCCATGGA ACCACTAAAA ACAAGGTCTG 1200 

CTCTGCTCCT GAAGCCCTAT ATGCTGGAGA TGGACAACTC AATGAAAATT TAAAGGGAAA 1260 

ACCCTCAGGC CTGAGGTGTG TGCCACTCAG AGACTTCACC TAACTAGAGA CAGTCAAACT 1320 

GCAAACCATG GTGAGAAATT GACGACTTCA CACTATGGAC AGCTTTTCCC AAGATGTCAA 1380 

AACAAGACTC CTCATCATGA TAAGGCTCTT ACCCCCTTTT AATTTGTCCT TGCTTATGCC 1440 

TGCCTCTTTC GCTTGGCAGG ATGATGCTGT CATTAGTATT TCACAAGAAG TAGCTTCAGA 1500 

GGGTAACTTA ACAGAGTGTC AGATCTATCT TGTCAATCCC AACGTTTTAC ATAAAATAAG 1560 

AGATCCTTTA GTGCACCCAG TGACTGACAT TAGCAGCATC TTTAACACAG CCGTGTGTTC 1620 

AAATGTACAG TGGTCCTTTT CAGAGTTGGA CTTCTAGACT CACCTGTTCT CACTCCCTGT 1680 

TTTAATTCAA CCCAGCCATG CAATGCCAAA TAATAGAATT GCTCCCTACC AGCTGAACAG 1740 

GGAGGAGTCT GTGCAGTTTC TGACACTTGT TGTTGAACAT GGCTAAATAC AATGGGTATC 1800 

GCTGAGACTA AGTTGTAGAA ATTAACAAAT GTGCTGCTTG GTTAAAATGG CTACACTCAT 1860 

CTGACTCATT CTTTATTCTA TTTTAGTTGG TTTGTATCTT GCCTAAGGTG CGTAGTCCAA 1920 

CTCTTGGTAT TACCCTCCTA ATAGTCATAC TAGTAGTCAT ACTCCCTGGT GTAGTGTATT 1980 

CTCTAAAAGC TTTAAATGTC TGCATGCAGC CAGCCATCAA ATAGTGAATG GTCTCTCTTT 2040 

GGCTGGAATT ACAAAACTCA GAGAAATGTG TCATCAGGAG AACATCATAA CCCATGAAGG 2100 

ATAAAAGCCC CAAATGGTGG TAACTGATAA TAGCACTAAT GCTTTAAGAT TTGGTCACAC 2160 

TCTCACCTAG GTGAGCGCAT TGAGCCAGTG GTGCTAAATG CTACATACTC CAACTGAAAT 2220 

GTTAAGGAAG AAGATAGATC CAATTAAAAA AAATTAAAAC CAATTTAAAA AAAAAAAAGA 2280

ACACAGGAGA TTCCAGTCTA CTTGAGTTAG CATAATACAG AAGTCCCCTC TACTTTAACT 2340 

TTTACAAAAA AGTAACCTGA ACTAATCTGA TGTTAACCAA TGTATTTATT TCTGTGGTTC 2400 

TGTTTCCTTG TTCCAATTTG ACAAAACCCA CTGTTCTTGT ATTGTATTGC CCAGGGGGAG 2460 

CTATCACTGT ACTTGTAGAG TGGTGCTGCT TTAATTCATA AATCACAAAT AAAAGCCAAT 2520 
TAGCTCTATA ACT 

Seq ID NO: 629 Protein sequence 
Protein Accession #: AAA59909.1 

1 11 21 31 41 51 

| | | | | |

MLTNVFISVV LFPCSNLTKP TVLVLYCPGG AITVLVEWCC FNS



Seq ID NO: 630 DNA sequence 

Nucleic Acid Accession #: NM_016639.1

Coding sequence: 40..429

1 11 21 31 41 51

| | | | | |

GCGGCGGGCG CAGACAGCGG CGGGCGCAGG ACGTGCACTA TGGCTCGGGG CTCGCTGCGC 60 
CGGTTGCTGC GGCTCCTCGT GCTGGGGCTC TGGCTGGCGT TGCTGCGCTC CGTGGCCGGG 120 
GAGCAAGCGC CAGGCACCGC CCCCTGCTCC CGCGGCAGCT CCTGGAGCGC GGACCTGGAC 180 
AAGTGCATGG ACTGCGCGTC TTGCAGGGCG CGACCGCACA GCGACTTCTG CCTGGGCTGC 240 
GCTGCAGCAC CTCCTGCCCC CTTCCGGCTG CTTTGGCCCA TCCTTGGGGG CGCTCTGAGC 300 
CTGACCTTCG TGCTGGGGCT GCTTTCTGGC TTTTTGGTCT GGAGACGATG CCGCAGGAGA 360 
GAGAAGTTCA CCACCCCCAT AGAGGAGACC GGCGGAGAGG GCTGCCCAGC TGTGGCGCTG 420
ATCCAGTGAC AATGTGCCCC CTGCCAGCCG GGGCTCGCCC ACTCATCATT CATTCATCCA 480

TTCTAGAGCC AGTCTCTGCC TCCCAGACGC GGCGGGAGCC AAGCTCCTCC AACCACAAGG 540 

GGGGTGGGGG GCGGTGAATC ACCTCTGAGG CCTGGGCCCA GGGTTCAGGG GAACCTTCCA 600 

AGGTGTCTGG TTGCCCTGCC TCTGGCTCCA GAACAGAAAG GGAGCCTCAC GCTGGCTCAC 660 

ACAAAACAGC TGACACTGAC TAAGGAACTG CAGCATTTGC ACAGGGGAGG GGGGTGCCCT 720 

CCTTCCTTAG GACCTGGGGG CCAGGCTGAC TTGGGGGGCA GACTTGACAC TAGGCCCCAC 780 

TCACTCAGAT GTCCTGAAAT TCCACCACGG GGGTCACCCT GGGGGGTTAG GGACCTATTT 840 

TTAACACTAG GGGCTGGCCC ACTAGGAGGG CTGGCCCTAA GATACAGACC CCCCCAACTC 900 

CCCAAAGCGG GGAGGAGATA TTTATTTTGG GGAGAGTTTG GAGGGGAGGG AGAATTTATT 960 
AATAAAAGAA TCTTTAACTT TAAAAAAAAA AAAAAAAA 

Seq ID NO: 631 Protein sequence 
Protein Accession #: NP_057723.1 

1 11 21 31 41 51 

MARGSLRRLL RLLVLGLWLA LLRSVAGEQA PGTAPCSRGS SWSADLDKCM DCASCRARPH 60

SDFCLGCAAA PPAPFRLLWP ILGGALSLTF VLGLLSGFLV WRRCRRREKF TTPIEETGGE 120 
GCPAVALIQ 

Seq ID NO: 632 DNA sequence

Nucleic Acid Accession #: NM_003816.1

Coding sequence: 79..2538

1 11 21 31 41 51

| | | | | |

CGGCAGGGTT GGAAAATGAT GGAAGAGGCG GAGGTGGAGG CGACCGAGTG CTGAGAGGAA 60 

CCTGCGGAAT CGGCCGAGAT GGGGTCTGGC GCGCGCTTTC CCTCGGGGAC CCTTCGTGTC 120 

CGGTGGTTGC TGTTGCTTGG CCTGGTGGGC CCAGTCCTCG GTGCGGCGCG GCCAGGCTTT 180 

CAACAGACCT CACATCTTTC TTCTTATGAA ATTATAACTC CTTGGAGATT AACTAGAGAA 240 

AGAAGAGAAG CCCCTAGGCC CTATTCAAAA CAAGTATCTT ATGTTATTCA GGCTGAAGGA 300 

AAAGAGCATA TTATTCACTT GGAAAGGAAC AAAGACCTTT TGCCTGAAGA TTTTGTGGTT 360 

TATACTTACA ACAAGGAAGG GACTTTAATC ACTGACCATC CCAATATACA GAATCATTGT 420 

CATTATCGGG GCTATGTGGA GGGAGTTCAT AATTCATCCA TTGCTCTTAG CGACTGTTTT 480 

GGACTCAGAG GATTGCTGCA TTTAGAGAAT GCGAGTTATG GGATTGAACC CCTGCAGAAC 540 

AGCTCTCATT TTGAGCACAT CATTTATCGA ATGGATGATG TCTACAAAGA GCCTCTGAAA 600 

TGTGGAGTTT CCAACAAGGA TATAGAGAAA GAAACTGCAA AGGATGAAGA GGAAGAGCCT 660 

CCCAGCATGA CTCAGCTACT TCGAAGAAGA AGAGCTGTCT TGCCACAGAC CCGGTATGTG 720 

GAGCTGTTCA TTGTCGTAGA CAAGGAAAGG TATGACATGA TGGGAAGAAA TCAGACTGCT 780 

GTGAGAGAAG AGATGATTCT CCTGGCAAAC TACTTGGATA GTATGTATAT TATGTTAAAT 840 

ATTCGAATTG TGCTAGTTGG ACTGGAGATT TGGACCAATG GAAACCTGAT CAACATAGTT 900 

GGGGGTGCTG GTGATGTGCT GGGGAACTTC GTGCAGTGGC GGGAAAAGTT TCTTATCACA 960 

CGTCGGAGAC ATGACAGTGC ACAGCTAGTT CTAAAGAAAG GTTTTGGTGG AACTGCAGGA 1020 

ATGGCATTTG TGGGAACAGT GTGTTCAAGG AGCCACGCAG GCGGGATTAA TGTGTTTGGA 1080 

CAAATCACTG TGGAGACATT TGCTTCCATT GTTGCTCATG AATTGGGTCA TAATCTTGGA 1140 

ATGAATCACG ATGATGGGAG AGATTGTTCC TGTGGAGCAA AGAGCTGCAT CATGAATTCA 1200 

GGAGCATCGG GTTCCAGAAA CTTTAGCAGT TGCAGTGCAG AGGACTTTGA GAAGTTAACT 1260 

TTAAATAAAG GAGGAAACTG CCTTCTTAAT ATTCCAAAGC CTGATGAAGC CTATAGTGCT 1320 

CCCTCCTGTG GTAATAAGTT GGTGGACGCT GGGGAAGAGT GTGACTGTGG TACTCCAAAG 1380 

GAATGTGAAT TGGACCCTTG CTGCGAAGGA AGTACCTGTA AGCTTAAATC ATTTGCTGAG 1440 

TGTGCATATG GTGACTGTTG TAAAGACTGT CGGTTCCTTC CAGGAGGTAC TTTATGCCGA 1500

GGAAAAACCA GTGAGTCTGA TGTTCCAGAG TACTGCAATG GTTCTTCTCA GTTCTGTCAG 1560 

CCAGATGTTT TTATTCAGAA TGGATATCCT TGCCAGAATA ACAAAGCCTA TTGCTACAAC 1620 

GGCATGTGCC AGTATTATGA TGCTCAATGT CAAGTCATCT TTGGCTCAAA AGCCAAGGCT 1680 

GCCCCCAAAG ATTGTTTCAT TGAAGTGAAT TCTAAAGGTG ACAGATTTGG CAATTGTGGT 1740 

TTCTCTGGCA ATGAATACAA GAAGTGTGCC ACTGGGAATG CTTTGTGTGG AAAGCTTCAG 1800 

TGTGAGAATG TACAAGAGAT ACCTGTATTT GGAATTGTGC CTGCTATTAT TCAAACGCCT 1860 

AGTCGAGGCA CCAAATGTTG GGGTGTGGAT TTCCAGCTAG GATCAGATGT TCCAGATCCT 1920 

GGGATGGTTA ACGAAGGCAC AAAATGTGGT GCTGGAAAGA TCTGTAGAAA CTTCCAGTGT 1980 

GTAGATGCTT CTGTTCTGAA TTATGACTGT GATGTTCAGA AAAAGTGTCA TGGACATGGG 2040 

GTATGTAATA GCAATAAGAA TTGTCACTGT GAAAATGGCT GGGCTCCCCC AAATTGTGAG 2100 

ACTAAAGGAT ACGGAGGAAG TGTGGACAGT GGACCTAGAT ACAATGAAAT GAATACTGCA 2160 

TTGAGGGACG GACTTCTGGT CTTCTTCTTC CTAATTGTTC CCCTTATTGT CTGTGCTATT 2220 

TTTATCTTCA TCAAGAGGGA TCAACTGTGG AGAAGCTACT TCAGAAAGAA GAGATCACAA 2280 

ACATATGAGT CAGATGGCAA AAATCAAGCA AACCCTTCTA GACAGCCGGG GAGTGTTCCT 2340 

CGACATGTTT CTCCAGTGAC ACCTCCCAGA GAAGTTCCTA TATATGCAAA CAGATTTGCA 2400 

GTACCAACCT ATGCAGCCAA GCAACCTCAG CAGTTCCCAT CAAGGCCACC TCCACCACAA 2460 

CCGAAAGTAT CATCTCAGGG AAACTTAATT CCTGCCCGTC CTGCTCCTGC ACCTCCTTTA 2520 

TATAGTTCCC TCACTTGATT TTTTTAACCT TCTTTTTGCA AATGTCTTCA GGGAACTGAG 2580 

CTAATACTTT TTTTTTTTCT TGATGTTTTC TTGAAAAGCC TTTCTGTTGC AACTATGAAT 2640 

GAAAACAAAA CACCACAAAA CAGACTTCAC TAACACAGAA AAACAGAAAC TGAGTGTGAG 2700 

AGTTGTGAAA TACAAGGAAA TGCAGTAAAG CCAGGGAATT TACAATAACA TTTCCGTTTC 2760 

CATCATTGAA TAAGTCTTAT TCAGTCATCG GTGAGGTTAA TGCACTAATC ATGGATTTTT 2820 

TGAACATGTT ATTGCAGTGA TTCTCAAATT AACTGTATTG GTGTAAGATT TTTGTCATTA 2880 

AGTGTTTAAG TGTTATTCTG AATTTTCTAC CTTAGTTATC ATTAATGTAG TTCCTCATTG 2940 

AACATGTGAT AATCTAATAC CTGTGAAAAC TGACTAATCA GCTGCCAATA ATATCTAATA 3000 

TTTTTCATCA TGCACGAATT AATAATCATC ATACTCTAGA ATCTTGTCTG TCACTCACTA 3060

CATGAATAAG CAAATATTGT CTTCAAAAGA ATGCACAAGA ACCACAATTA AGATGTCATA 3120 

TTATTTTGAA AGTACAAAAT ATACTAAAAG AGTGTGTGTG TATTCACGCA GTTACTCGCT 3180 

TCCATTTTTA TGACCTTTCA ACTATAGGTA ATAACTCTTA GAGAAATTAA TTTAATATTA 3240 

GAATTTCTAT TATGAATCAT GTGAAAGCAT GACATTCGTT CACAATAGCA CTATTTTAAA 3300 

TAAATTATAA GCTTTAAGGT ACGAAGTATT TAATAGATCT AATCAAATAT GTTGATTCAT 3360 

GGCTATAATA AAGCAGGAGC AATTATAAAA TCTTCAATCA ATTGAACTTT TACAAAACCA 3420 

CTTGAGAATT TCATGAGCAC TTTAAAATCT GAACTTTCAA AGCTTGCTAT TAAATCATTT 3480 

AGAATGTTTA CATTTACTAA GGTGTGCTGG GTCATGTAAA ATATTAGACA CTAATATTTT 3540 

CATAGAAATT AGGCTGGAGA AAGAAGGAAG AAATGGTTTT CTTAAATACC TACAAAAAAG 3600 

TTACTGTGGT ATCTATGAGT TATCATCTTA GCTGTGTTAA AAATGAATTT TTACTATGGC 3660 
AGATATGGTA TGGATCGTAA AATTTTAAGC ACTAAAAATT TTTTCATAAC CTTTCATAAT 3720

AAAGTTTAAT AATAGGTTTA TTAACTGAAT TTCATTAGTT TTTTAAAAGT GTTTTTGGTT 3780

TGTGTATATA TACATATACA AATACAACAT TTACAATAAA TAAAATACTT GAAATTCTCA 3840
AAAAAAAAAA AAAAAAAAAA AAAAA

PCT/US02/12476



Seq ID NO: 633 Protein sequence
Protein Accession #: NP_003807.1

1 11 21 31 41 51
| | | | | |
MGSGARFPSG TLRVRWLLLL GLVGPVLGAA RPGFQQTSHL SSYEIITPWR LTRERREAPR 60
PYSKQVSYVI QAEGKEHIIH LERNKDLLPE DFVVYTYNKE GTLITDHPNI QNHCHYRGYV 120
EGVHNSSIAL SDCFGLRGLL HLENASYGIE PLQNSSHFEH IIYRMDDVYK EPLKCGVSNK 180
DIEKETAKDE EEEPPSMTQL LRRRRAVLPQ TRYVELFIVV DKERYDMMGR NQTAVREEMI 240
LLANYLDSMY IMLNIRIVLV GLEIWTNGNL INIVGGAGDV LGNFVQWREK FLITRRRHDS 300
AQLVLKKGFG GTAGMAFVGT VCSRSHAGGI NVFGQITVET FASIVAHELG HNLGMNHDDG 360
RDCSCGAKSC IMNSGASGSR NFSSCSAEDF EKLTLNKGGN CLLNIPKPDE AYSAPSCGNK 420
LVDAGEECDC GTPKECELDP CCEGSTCKLK SFAECAYGDC CKDCRFLPGG TLCRGKTSEC 480
DVPEYCNGSS QFCQPDVFIQ NGYPCQNNKA YCYNGMCQYY DAQCQVIFGS KAKAAPKDCF 540
IEVNSKGDRF GNCGFSGNEY KKCATGNALC GKLQCENVQE IPVFGIVPAI IQTPSRGTKC 600
WGVDFQLGSD VPDPGMVNEG TKCGAGKICR NFQCVDASVL NYDCDVQKKC HGHGVCNSNK 660
NCHCENGWAP PNCETKGYGG SVDSGPTYNE MNTALRDGLL VFFFLIVPLI VCAIFIFIKR 720
DQLWRSYFRK KRSQTYESDG KNQANPSRQP GSVPRHVSPV TPPREVPIYA NRFAVPTYAA 780
KQPQQFPSRP PPPQPKVSSQ GNLIPARPAP APPLYSSLT

Seq ID NO: 634 DNA sequence
Nucleic Acid Accession #: NM_002091.1
Coding sequence: 56..503

1 11 21 31 41 51
| | | | | |
AGTCTCTGCT CTTCCCAGCC TCTCCGGCGC GCTCCAAGGG CTTCCCGTCG GGACCATGCG 60
CGGCAGTGAG CTCCCGCTGG TCCTGCTGGC GCTGGTCCTC TGCCTAGCGC CCCGGGGGCG 120
AGCGGTCCCG CTGCCTGCGG GCGGAGGGAC CGTGCTGACC AAGATGTACC CGCGCGGCAA 180
CCACTGGGCG GTGGGGCACT TAATGGGGAA AAAGAGCACA GGGGAGTCTT CTTCTGTTTC 240
TGAGAGAGGG AGCCTGAAGC AGCAGCTGAG AGAGTACATC AGGTGGGAAG AAGCTGCAAG 300
GAATTTGCTG GGTCTCATAG AAGCAAAGGA GAACAGAAAC CACCAGCCAC CTCAACCCAA 360
GGCCTTGGGC AATCAGCAGC CTTCGTGGGA TTCAGAGGAT AGCAGCAACT TCAAAGATGT 420
AGGTTCAAAA GGCAAAGTTG GTAGACTCTC TGCTCCAGGT TCTCAACGTG AAGGAAGGAA 480
CCCCCAGCTG AACCAGCAAT GATAATGATG GCCTCTCTCA AAAGAGAAAA ACAAAACCCC 540
TAAGAGACTG AGTTCTGCAA GCATCAGTTC TACGGATCAT CAACAAGATT TCCTTGTGCA 600
AAATATTTGA CTATTCTGTA TCTTTCATCC TTGACTAAAT TCGTGATTTT CAAGCAGCAT 660
CTTCTGGTTT AAACTTGTTT GCTGTGAACA ATTGTCGAAA AGAGTCTTCC AATTAATGCT 720
TTTTTATATC TAGGCTACCT GTTGGTTAGA TTCAAGGCCC CGAGCTGTTA CCATTCACAA 780
TAAAAGCTTA AACACAT

Seq ID NO: 635 Protein sequence
Protein Accession #: NP_002082.1

1 11 21 31 41 51
| | | | | |
MRGSELPLVL LALVLCLAPR GRAVPLPAGG GTVLTKMYPR GNHWAVGHLM GKKSTGESSS 60
VSERGSLKQQ LREYIRWEEA ARNLLGLIEA KENRNHQPPQ PKALGNQQPS WDSEDSSNFK 120
DVGSKGKVGR LSAPGSQREG RNPQLNQQ

Seq ID NO: 636 DNA sequence
Nucleic Acid Accession #: NM_016522.1
Coding sequence: 265..1299

1 11 21 31 41 51
| | | | | |
GCGGAAGCAG CGAGGAGGGA GCCCCCTTTG GCCGTCCTCC GTGGAACCGG TTTTCCGAGG 60
CTGGCAAAAG CCGAGGCTGG ATTTGGGGGA GGAATATTAG ACTCGGAGGA GTCTGCGCGC 120
TTTTCTCCTC CCCGCGCCTC CCGGTCGCCG CGGGTTCACC GCTCAGTCCC CGCGCTCGCT 180
CCGCACCCCA CCCACTTCCT GTGCTCGCCC GGGGGGCGTG TGCCGTGCGG CTGCCGGAGT 240
TCGGGGAAGT TGTGGCTGTC GAGAATGGGG GTCTGTGGGT ACCTGTTCCT GCCCTGGAAG 300
TGCCTCGTGG TCGTGTCTCT CAGGCTGCTG TTCCTTGTAC CCACAGGAGT GCCCGTGCGC 360
AGCGGAGATG CCACCTTCCC CAAAGCTATG GACAACGTGA CGGTCCGGCA GGGGGAGAGC 420
GCCACCCTCA GGTGCACTAT TGACAACCGG GTCACCCGGG TGGCCTGGCT AAACCGCAGC 480
ACCATCCTCT ATGCTGGGAA TGACAAGTGG TGCCTGGATC CTCGCGTGGT CCTTCTGAGC 540
AACACCCAAA CGCAGTACAG CATCGAGATC CAGAACGTGG ATGTGTATGA CGAGGGCCCT 600
TACACCTGCT CGGTGCAGAC AGACAACCAC CCAAAGACCT CTAGGGTCCA CCTCATTGTG 660
CAAGTATCTC CCAAAATTGT AGAGATTTCT TCAGATATCT CCATTAATGA AGGGAACAAT 720
ATTAGCCTCA CCTGCATAGC AACTGGTAGA CCAGAGCCTA CGGTTACTTG GAGACACATC 780
TCTCCCAAAG CGGTTGGCTT TGTGAGTGAA GACGAATACT TGGAAATTCA GGGCATCACC 840
CGGGAACAGT CAGGGGACTA CGAGTGCAGT GCCTCCAATG ACGTGGCCGC GCCCGTGGTA 900
CGGAGAGTAA AGGTCACCGT GAACTATCCA CCATACATTT CAGAAGCCAA GGGTACAGGT 960
GTCCCCGTGG GACAAAAGGG GACACTGCAG TGTGAAGCCT CAGCAGTCCC CTCAGCAGAA 1020
TTCCAGTGGT ACAAGGATGA CAAAAGACTG ATTGAAGGAA AGAAAGGGGT GAAAGTGGAA 1080
AACAGACCTT TCCTCTCAAA ACTCATCTTC TTCAATGTCT CTGAACATGA CTATGGGAAC 1140
TACACTTGCG TGGCCTCCAA CAAGCTGGGC CACACCAATG CCAGCATCAT GCTATTTGGT 1200
CCAGGCGCCG TCAGCGAGGT GAGCAACGGC ACGTCGAGGA GGGCAGGCTG CGTCTGGCTG 1260
CTGCCTCTTC TGGTCTTGCA CCTGCTTCTC AAATTTTGAT GTGAGTGCCA CTTCCCCACC 1320
CGGGAAAGGC TGCCGCCACC ACCACCACCA ACACAACAGC AATGGCAACA CCGACAGCAA 1380
CCAATCAGAT ATATACAAAT GAAATTAGAA GAAACACAGC CTCATGGGAC AGAAATTTGA 1440
GGGAGGGGAA CAAAGAATAC TTTGGGGGGA AAAGAGTTTT AAAAAAGAAA TTGAAAATTG 1500
CCTTGCAGAT ATTTAGGTAC AATGGAGTTT TCTTTTCCCA AACGGGAAGA ACACAGCACA 1560






WO 02/086443 

CCGGGCTTGG ACCCACTGCA AGCTGCATCG TGCAACCTCT TTGGTGCCAG TGTGGGCAAG 1620 

GGCTCAGCCT CTCTGCCCAC AGACTGCCCC CACGTGGAAC ATTCTGGAGC TGGCCATCCC 1680 

AAATTCAATC AGTCCATAGA GACGAACAGA ATGAGACCTT CCGGCCCAAG CGTGGCGCTT 1740 

CCGGCCCAAG CGTGGCGCTG CGGGCACTTT GGTAGACTGT GCCACCACGG CGTGTGTTGT 1800 
GAAACGTGAA ATAAAAAGAG CAAAAAAAAA AAAAAAAAA 

Seq ID NO: 637 Protein sequence
Protein Accession #: NP_057606.1

1 11 21 31 41 51
| | | | | |
MGVCGYLFLP WKCLVVVSLR LLFLVPTGVP VRSGDATFPK AMDNVTVRQG ESATLRCTID 60
NRVTRVAWLN RSTILYAGND KWCLDPRVVL LSNTQTQYSI EIQNVDVYDE GPYTCSVQTD 120
NHPKTSRVHL IVQVSPKIVE ISSDISINEG NNISLTCIAT GRPEPTVTWR HISPKAVGFV 180
SEDEYLEIQG ITREQSGDYE CSASNDVAAP VVRRVKVTVN YPPYISEAKG TGVPVGQKGT 240
LQCEASAVPS AEFQWYKDDK RLIEGKKGVK VENRPFLSKL IFFNVSEHDY GNYTCVASNK 300
LGHTNASIML FGPGAVSEVS NGTSRRAGCV WLLPLLVLHL LLKF



Seq ID NO: 638 DNA sequence
Nucleic Acid Accession #: NM_012261.1
Coding sequence: 203..1045

1 11 21 31 41 51
| | | | | |
GATTTGCTCT GCCAGCAGCT GTCGGTGCCG CGCTCGACAC CGAGTCCTAG CTAGGCGCTC 60
ACAGAATACG CGCTCCCTCC CTCCCCCTTC TCTGTCCCCC GCCTCTCGCT CACCCCGGCC 120
CACTCCAGCG GCGACTTTGA GGGATTCCCT CTCTGGCGGC CTCTGCAGCA GCACAGCCGG 180
CCTCATTCGG GGCACTGCGA GTATGGATCT CCAAGGAAGA GGGGTCCCCA GCATCGACAG 240
ACTTCGAGTT CTCCTGATGT TGTTCCATAC AATGGCTCAA ATCATGGCAG AACAAGAAGT 300
GGAAAATCTC TCAGGCCTTT CCACTAACCC TGAAAAAGAT ATATTTGTGG TGCGGGAAAA 360
TGGGACGACG TGTCTCATGG CAGAGTTTGC AGCCAAATTT ATTGTACCTT ATGATGTGTG 420
GGCCAGCAAC TACGTAGATC TGATCACAGA ACAGGCCGAT ATCGCATTGA CCCGGGGAGC 480
TGAGGTGAAG GGCCGCTGTG GCCACAGCCA GTCGGAGCTG CAAGTGTTCT GGGTGGATCG 540
CGCATATGCA CTCAAAATGC TCTTTGTAAA GGAAAGCCAC AACATGTCCA AGGGACCTGA 600
GGCGACTTGG AGGCTGAGCA AAGTGCAGTT TGTCTACGAC TCCTCGGAGA AAACCCACTT 660
CAAAGACGCA GTCAGTGCTG GGAAGCACAC AGCCAACTCG CACCACCTCT CTGCCTTGGT 720
CACCCCCGCT GGGAAGTCCT ATGAGTGTCA AGCTCAACAA ACCATTTCAC TGGCCTCTAG 780
TGATCCGCAG AAGACGGTCA CCATGATCCT GTCTGCGGTC CACATCCAAC CTTTTGACAT 840
TATCTCAGAT TTTGTCTTCA GTGAAGAGCA TAAATGCCCA GTGGATGAGC GGGAGCAACT 900
GGAAGAAACC TTGCCCCTGA TTTTGGGGCT CATCTTGGGC CTCGTCATCA TGGTAACACT 960
CGCGATTTAC CACGTCCACC ACAAAATGAC TGCCAACCAG GTGCAGATCC CTCGGGACAG 1020
ATCCCAGTAT AAGCACATGG GCTAGAGGCC GTTAGGCAGG CACCCCCTAT TCCTGCTCCC 1080
CCAACTGGAT CAGGTAGAAC AACAAAAGCA CTTTTCCATC TTGTACACGA GATACACCAA 1140
CATAGCTACA ATCAAACAGG CCTGGGTATC TGAGGCTTGC TTGGCTTGTG TCCATGCTTA 1200
AACCCACGGA AGGGGGAGAC TCTTTCGGAT TTGTAGGGTG AAATGGCAAT TATTCTCTCC 1260
ATGCTGGGGA GGAGGGGAGG AGGGTCTCAG ACAGCTTTCG TGCTCATGGT GGCTTGGCTT 1320
TGACTCTCCA AAGAGCAATA AATGCCACTT GGAGCTGTAT CTGGCCCCAA AGTTTAGGGA 1380
TTGAAAACAT GCTTCTTTGA GGAGGAAACC CCTTTAGGTT CAGAAGAATA TGGGGTGCTT 1440
TGCTCCCTTG GACACAGCTG GCTTATCCTA TACAGTTGTC AATGCACACA GAATACAACC 1500
TCATGCTCCC TGCAGCAAGA CCCCTGAAAG TGATTCATGC TTCTGGCTGG CATTCTGCAT 1560
GTTTAGTGAT TGTCTTGGGA ATGTTTCACT GCTACCCGCA TCCAGCGACT GCAGCACCAG 1620
AAAACGACTA ATGTAACTAT GCAGAGTTGT TTGGACTTCT TCCTGTGCCA GGTCCAAGTC 1680
GGGGGACCTG AAGAATCAAT CTGTGTGAGT CTGTTTTTCA AAATGAAATA AAACACACTA 1740
TTCTCTGGC



Seq ID NO: 639 Protein sequence
Protein Accession #: NP_036393.1

1 11 21 31 41 51
| | | | | |
MDLQGRGVPS IDRLRVLLML FHTMAQIMAE QEVENLSGLS TNPEKDIFVV RENGTTCLMA 60
EFAAKFIVPY DVWASNYVDL ITEQADIALT RGAEVKGRCG HSQSELQVFW VDRAYALKML 120
FVKESHNMSK GPEATWRLSK VQFVYDSSEK THFKDAVSAG KHTANSHHLS ALVTPAGKSY 180
ECQAQQTISL ASSDPQKTVT MILSAVHIQP FDIISDFVFS EEHKCPVDER EQLEETLPLI 240
LGLILGLVIM VTLAIYHVHH KMTANQVQIP RDRSQYKHMG

Seq ID NO: 640 DNA sequence
Nucleic Acid Accession #: NM_002993.1
Coding sequence: 64..408

1 11 21 31 41 51
| | | | | |
GGCACGAGCC AGTCTCCGCG CCTCCACCCA GCTCAGGAAC CCGCGAACCC TCTCTTGACC 60
ACTATGAGCC TCCCGTCCAG CCGCGCGGCC CGTGTCCCGG GTCCTTCGGG CTCCTTGTGC 120
GCGCTGCTCG CGCTGCTGCT CCTGCTGACG CCGCCGGGGC CCCTCGCCAG CGCTGGTCCT 180
GTCTCTGCTG TGCTGACAGA GCTGCGTTGC ACTTGTTTAC GCGTTACGCT GAGAGTAAAC 240
CCCAAAACGA TTGGTAAACT GCAGGTGTTC CCCGCAGGCC CGCAGTGCTC CAAGGTGGAA 300
GTGGTAGCCT CCCTGAAGAA CGGGAAGCAA GTTTGTCTGG ACCCGGAAGC CCCTTTTCTA 360
AAGAAAGTCA TCCAGAAAAT TTTGGACAGT GGAAACAAGA AAAACTGAGT AACAAAAAAG 420
ACCATGCATC ATAAAATTGC CCAGTCTTCA GCGGAGCAGT TTTCTGGAGA TCCCTGGACC 480
CAGTAAGAAT AAGAAGGAAG GGTTGGTTTT TTTCCATTTT CTACATGGAT TCCCTACTTT 540
GAAGAGTGTG GGGGAAAGCC TACGCTTCTC CCTGAAGTTT ACAGCTCAGC TAATGAAGTA 600
CTAATATAGT ATTTCCACTA TTTACTGTTA TTTTACCTGA TAAGTTATTG AACCCTTTGG 660
CAATTGACCA TATTGTGAGC AAAGAATCAC TGGTTATTAG TCTTTCAATG AATATTGAAT 720
TGAAGATAAC TATTGTATTT CTATCATACA TTCCTTAAAG TCTTACCGAA AAGGCTGTGG 780
ATTTCGTATG GAAATAATGT TTTATTAGTG TGCTGTTGAG GGAGGTATCC TGTTGTTCTT 840
ACTCACTCTT CTCATAAAAT AGGAAATATT TTAGTTCTGT TTTCTTGGGG AATATGTTAC 900




TCTTTACCCT AGGATGCTAT TTAAGTTGTA CTGTATTAGA ACACTGGGTG TGTCATACCG 960
TTATCTGTGC AGAATATATT TCCTTATTCA GAATTTCTAA AAATTTAAGT TCTGTAAGGG 1020
CTAATATATT CTCTTCCTAT GGTTTTAGAT GTTTGATGTC TTCTTAGTAT GGCATAATGT 1080
CATGATTTAC TCATTAAACT TTGATTTTGT ATGCTATTTT TTCACTATAG GATGACTATA 1140
ATTCTGGTCA CTAAATATAC ACTTTAGATA GATGAAGAAG CCCAAAAACA GATAAATTCC 1200
TGATTGCTAA TTTACATAGA AATGTATTCT CTTGGTTTTT TAAATAAAAG CAAAATTAAC 1260
AATGATCTGT GCTCTGCAAA GTTTTGAAAA TATATTTGAA CAATTTGAAT ATAAATTCAT 1320
CATTTAGTCC TCAAAATATA TACAGCATTG CTAAGATTTT CAGATATCTA TTGTGGATCT 1380
TTTAAAGGTT TTGACCATTT TGTTATGAGG AATTATACAT GTATCACATT CACTATATTA 1440
AAATTGCACT TTTATCTTTT CCTGTGTGTC ATGTTGGTTT TTGGTACTTG TATTGTCATT 1500
TGGAGAAACA ATAAAAGATT TCTAAACCAA AAAAAAAAAA AAAAAAA

Seq ID NO: 641 Protein sequence
Protein Accession #: NP_002984.1

1 11 21 31 41 51
| | | | | |
MSLPSSRAAR VPGPSGSLCA LLALLLLLTP PGPLASAGPV SAVLTELRCT CLRVTLRVNP 60
KTIGKLQVFP AGPQCSKVEV VASLKNGKQV CLDPEAPFLK KVIQKILDSG NKKN

Seq ID NO: 642 DNA sequence
Nucleic Acid Accession #: NM_013271.1
Coding sequence: 27..809

1 11 21 31 41 51
| | | | | |
TCCGGAGCCA GGCTCGCTGG GGCAGCATGG CGGGGTCGCC GCTGCTCTGG GGGCCGCGGG 60
CCGGGGGCGT CGGCCTTTTG GTGCTGCTGC TGCTCGGCCT GTTTCGGCCG CCCCCCGCGC 120
TCTGCGCGCG GCCGGTAAAG GAACCCCGCG GCCTAAGCGC AGCGTCTCCG CCCTTGGCTG 180
AGACTGGCGC TCCTCGCCGC TTCCGGCGGT CAGTGCCCCG AGGTGAGGCG GCGGGGGCGG 240
TGCAGGAGCT GGCGCGGGCG CTGGCGCATC TGCTGGAGGC CGAACGTCAG GAGCGGGCGC 300
GGGCCGAGGC GCAGGAGGCT GAGGATCAGC AGGCGCGCGT CCTGGCGCAG CTGCTGCGCG 360
TCTGGGGCGC CCCCCGCAAC TCTGATCCGG CTCTGGGCCT GGACGACGAC CCCGACGCGC 420
CTGCAGCGCA GCTCGCTCGC GCTCTGCTCC GCGCCCGCCT TGACCCTGCC GCCCTAGCAG 480
CCCAGCTTGT CCCCGCGCCC GTCCCCGCCG CGGCGCTCCG ACCCCGGCCC CCGGTCTACG 540
ACGACGGCCC CGCGGGCCCG GATGCTGAGG AGGCAGGCGA CGAGACACCC GACGTGGACC 600
CCGAGCTGTT GAGGTACTTG CTGGGACGGA TTCTTGCGGG AAGCGCGGAC TCCGAGGGGG 660
TGGCAGCCCC GCGCCGCCTC CGCCGTGCCG CCGACCACGA TGTGGGCTCT GAGCTGCCCC 720
CTGAGGGCGT GCTGGGGGCG CTGCTGCGTG TGAAACGCCT AGAGACCCCG GCGCCCCAGG 780
TGCCTGCACG CCGCCTCTTG CCACCCTGAG CACTGCCCGG ATCCCGTGCA CCCTGGGACC 840
CAGAAGTGCC CCCGCCATCC CGCCACCAGG ACTTCTCCCC GCCAGCACGT CCAGAGCAAC 900
TTACCCCGGC CAGCCAGCCC TCTCACCCGA GGATCCCTAC CCCCTGGCCC ACAATAACAT 960
GATCTGAGC



Seq ID NO: 643 Protein sequence
Protein Accession #: NP_037403.1

1 11 21 31 41 51
| | | | | |
MAGSPLLWGP RAGGVGLLVL LLLGLFRPPP ALCARPVKEP RGLSAASPPL AETGAPRRFR 60
RSVPRGEAAG AVQELARALA HLLEAERQER ARAEAQEAED QQARVLAQLL RVWGAPRNSD 120
PALGLDDDPD APAAQLARAL LRARLDPAAL AAQLVPAPVP AAALRPRPPV YDDGPAGPDA 180
EEAGDETPDV DPELLRYLLG RILAGSADSE GVAAPRRLRR AADHDVGSEL PPEGVLGALL 240
RVKRLETPAP QVPARRLLPP

Seq ID NO: 644 DNA sequence
Nucleic Acid Accession #: NM_002214
Coding sequence: 681..2990

1 11 21 31 41 51
| | | | | |
CCCAGAGCCG CCTCCCCCTG TTGCTGGCAT CCCGAGCTTC CTCCCTTGCC AGCCAGGACG 60
CTGCCGACTT GTCTTTGCCC GCTGCTCCGC AGACGGGGCT GCAAAGCTGC AACTAATGGT 120
GTTGGCCTCC CTGCCCACCT GTGGAAGCAA CTGCGCTGAT TGATGCGCCA CAGACTTTTT 180
TCCCCTCGAC CTCGCCGGCG TACCCTCCCA CAGATCCAGC ATCACCCAGT GAATGTACAT 240
TAGGGTGGTT TCCCCCCCAG CTTCGGGCTT TGTTTGGGTT TGATTGTGTT TGGCTCTTCG 300
CTAAGCTGAT TTATGCAGCA GAAGCCCCAC CGGCTGGAGA GAAACAAAAG CTCTTTTCTT 360
TGTCCCGGAG CAGGCTGCGG AGCCCTTGCA GAGCCCTCTC TCCAGTCGCC GCCGGGCCCT 420
TGGCCGTCGA AGGAGGTGCT TCTCGCGGAG ACCGCGGGAC CCGCCGTGCC GAGCCGGGAG 480
GGCCGTAGGG GCCCTGAGAT GCCGAGCGGT GCCCGGGCCC GCTTACCTGC ACCGCTTGCT 540
CCGAGCCGCG GGGTCCGCCT GCTAGGCCTG CGGAAAACGT CCTAGCGACA CTCGCCCGCG 600
GGCCCCGAGG TCGCCCGGGA GGCCGAGCCC GCGTCCGGAA GGCAGCCAGG CGGCGGGCGC 660
GGGGCGGGCT GTTTTGCATT ATGTGCGGCT CGGCCCTGGC TTTTTTTACC GCTGCATTTG 720
TCTGCCTGCA AAACGACCGG CGAGGTCCCG CCTCGTTCCT CTGGGCAGCC TGGGTGTTTT 780
CACTTGTTCT TGGACTGGGC CAAGGTGAAG ACAATAGATG TGCATCTTCA AATGCAGCAT 840
CCTGTGCCAG GTGCCTTGCG CTGGGTCCAG AATGTGGATG GTGTGTTCAA GAGGATTTCA 900
TTTCAGGTGG ATCAAGAAGT GAACGTTGTG ATATTGTTTC CAATTTAATA AGCAAAGGCT 960
GCTCAGTTGA TTCAATAGAA TACCCATCTG TGCATGTTAT AATACCCACT GAAAATGAAA 1020
TTAATACCCA GGTGACACCA GGAGAAGTGT CTATCCAGCT GCGTCCAGGA GCCGAAGCTA 1080
ATTTTATGCT GAAAGTTCAT CCTCTGAAGA AATATCCTGT GGATCTTTAT TATCTTGTTG 1140
ATGTCTCAGC ATCAATGCAC AATAATATAG AAAAATTAAA TTCCGTTGGA AACGATTTAT 1200
CTAGAAAAAT GGCATTTTTC TCCCGTGACT TTCGTCTTGG ATTTGGCTCA TACGTTGATA 1260
AAACAGTTTC ACCATACATT AGCATCCACC CCGAAAGGAT TCATAATCAA TGCAGTGACT 1320
ACAATTTAGA CTGCATGCCT CCCCATGGAT ACATCCATGT GCTGTCTTTG ACAGAGAACA 1380
TCACTGAGTT TGAGAAAGCA GTTCATAGAC AGAAGATCTC TGGAAACATA GATACACCAG 1440
AAGGAGGTTT TGACGCCATG CTTCAGGCAG CTGTCTGTGA AAGTCATATC GGATGGCGAA 1500
AAGAGGCTAA AAGATTGCTG CTGGTGATGA CAGATCAGAC GTCTCATCTC GCTCTTGATA 1560




GCAAATTGGC AGGCATAGTG GTGCCCAATG ACGGAAACTG TCATCTGAAA AACAACGTCT 1620 

ACGTCAAATC GACAACCATG GAACACCCCT CACTAGGCCA ACTTTCAGAG AAATTAATAG 1680

ACAACAACAT TAATGTCATC TTTGCAGTTC AAGGAAAACA ATTTCATTGG TATAAGGATC 1740 
TTCTACCCCT CTTGCCAGGC ACCATTGCTG GTGAAATAGA ATCAAAGGCT GCAAACCTCA 1800

ATAATTTGGT AGTGGAAGCC TATCAGAAGC TCATTTCAGA AGTGAAAGTT CAGGTGGAAA 1860 

ACCAGGTACA AGGCATCTAT TTTAACATTA CCGCCATCTG TCCAGATGGG TCCAGAAAGC 1920 

CAGGCATGGA AGGATGCAGA AACGTGACGA GCAATGATGA AGTTCTTTTC AATGTAACAG 1980 

TTACAATGAA AAAATGTGAT GTCACAGGAG GAAAAAACTA TGCAATAATC AAACCTATTG 2040 

GTTTTAATGA AACCGCTAAA ATTCATATAC ACAGAAACTG CAGCTGTCAG TGTGAGGACA 2100 

ACAGAGGACC TAAAGGAAAG TGTGTAGATG AAACTTTTCT AGATTCCAAG TGTTTCCAGT 2160 

GTGATGAGAA TAAATGTCAT TTTGATGAAG ATCAGTTTTC TTCTGAGAGT TGCAAGTCAC 2220 

ACAAGGATCA GCCTGTTTGC AGTGGTCGAG GAGTTTGTGT TTGTGGGAAA TGTTCATGTC 2280 

ACAAAATTAA GCTTGGAAAA GTGTATGGAA AATACTGTGA AAAGGATGAC TTTTCTTGTC 2340 

CATATCACCA TGGAAATCTG TGTGCTGGGC ATGGAGAGTG TGAAGCAGGC AGATGCCAAT 2400 

GCTTCAGTGG CTGGGAAGGT GATCGATGCC AGTGCCCTTC AGCAGCAGCC CAGCACTGTG 2460 

TCAATTCAAA GGGCCAAGTG TGCAGTGGAA GAGGCACGTG TGTGTGTGGA AGGTGTGAGT 2520 

GCACCGATCC CAGGAGCATC GGCCGCTTCT GTGAACACTG CCCCACCTGT TATACAGCCT 2580 

GCAAGGAAAA CTGGAATTGT ATGCAATGCC TTCACCCTCA CAATTTGTCT CAGGCTATAC 2640 

TTGATCAGTG CAAAACCTCA TGTGCTCTCA TGGAACAACA GCATTATGTC GACCAAACTT 2700 

CAGAATGTTT CTCCAGCCCA AGCTACTTGA GAATATTTTT CATCATTTTC ATAGTTACAT 2760 

TCTTGATTGG GTTGCTTAAA GTCCTGATCA TTAGACAGGT GATACTACAA TGGAATAGTA 2820 

ATAAAATTAA GTCCTCATCA GATTACAGAG TGTCAGCCTC AAAAAAGGAT AAGTTGATTC 2880 

TGCAAAGTGT TTGCACAAGA GCAGTCACCT ACCGACGTGA GAAGCCTGAA GAAATAAAAA 2940 

TGGATATCAG CAAATTAAAT GCTCATGAAA CTTTCAGGTG CAACTTCTAA AAAAAGATTT 3000

TTAAACACTT AATGGGAAAC TGGAATTGTT AATAATTGCT CCTAAAGATT ATAATTTTAA 3060 

AAGTCACAGG AGGAGACAAA TTGCTCACGG TCATGCCAGT TGCTGGTTGT ACACTCGAAC 3120 

GAAGACTGAC AAGTATCCTC ATCATGATGT GACTCACATA GCTGCTGACT TTTTCAGAGA 3180

AAAATGTGTC TTACTACTGT TTGAGACTAG TGTCGTTGTA GCACTTTACT GTAATATATA 3240 

ACTTATTTAG ATCAGCATAG AATGTAGATC CTCTGAAGAG CACTGATTAC ACTTTACAGG 3300 

TACCTGTTAT CCCTACGCTT CCCAGAGAGA ACAATGCTGT GAGAGAGTTT AGCATTGTGT 3360 

CACTACAAGG GTACAGTAAT CCCTGCACTG GACATGTGAG GAAAAAAATA ATCTGGCAAG 3420 

TATATTCTAA GGTTGCCAAA CACTTCAACA GTTGGTGGTT GAATAGACAA GAACAGCTAG 3480 

ATGAATAAAT GATTCGTGTT TCACTCTTTC AAGAGGTGAA CAGATACAAC CTTAATCTTA 3540 

AAAGATTATT GCTTTTTAAA GTGTGTAGTT TTATGCATGT GTGTTTATGG TTTGCTTATT 3600 

TTTGCAAGAT GGATACTAAT TCCAGCATTC TCTCCTCTTT GCCTTTATGT TTTGTTTTCT 3660 

TTTTTACAGG ATAAGTTTAT GTATGTCACA GATGACTGGA TTAATTAAGT GCTAAGTTAC 3720 

TACTGCCATA AAAAACTAAT AATACAATGT CACTTTATCA GAATACTAGT TTTAAAAGCT 3780 
GAATGTTAA 

Seq ID NO: 645 Protein sequence 
Protein Accession #: NP_002205 

1 11 21 31 41 51 

| | | | | |

MCGSALAFFT AAFVCLQNDR RGPASFLWAA WVFSLVLGLG QGEDNRCASS NAASCARCLA 60

LGPECGWCVQ EDFISGGSRS ERCDIVSNLI SKGCSVDSIE YPSVHVIIPT ENEINTQVTP 120

GEVSIQLRPG AEANFMLKVH PLKKYPVDLY YLVDVSASMH NNIEKLNSVG NDLSRKMAFF 180

SRDFRLGFGS YVDKTVSPYI SIHPERIHNQ CSDYNLDCMP PHGYIHVLSL TENITEFEKA 240

VHRQKISGNI DTPEGGFDAM LQAAVCESHI GWRKEAKRLL LVMTDQTSHL ALDSKLAGIV 300

VPNDGNCHLK NNVYVKSTTM EHPSLGQLSE KLIDNNINVI FAVQGKQFHW YKDLLPLLPG 360

TIAGEIESKA ANLNNLVVEA YQKLISEVKV QVENQVQGIY FNITAICPDG SRKPGMEGCR 420

NVTSNDEVLF NVTVTMKKCD VTGGKNYAII KPIGFNETAK IHIHRNCSCQ CEDNRGPKGK 480

CVDETFLDSK CFQCDENKCH FDEDQFSSES CKSHKDQPVC SGRGVCVCGK CSCHKIKLGK 540

VYGKYCEKDD FSCPYHHGNL CAGHGECEAG RCQCFSGWEG DRCQCPSAAA QHCVNSKGQV 600

CSGRGTCVCG RCECTDPRSI GRFCEHCPTC YTACKENWNC MQCLHPHNLS QAILDQCKTS 660

CALMEQQHYV DQTSECFSSP SYLRIFFIIF IVTFLIGLLK VLIIRQVILQ WNSNKIKSSS 720
DYRVSASKKD KLILQSVCTR AVTYRREKPE EIKMDISKLN AHETFRCNF

Seq ID NO: 646 DNA sequence 

Nucleic Acid Accession #: NM_003318.1

Coding sequence: 1..2574 

1 11 21 31 41 51 

| | | | | |

ATGGAATCCG AGGATTTAAG TGGCAGAGAA TTGACAATTG ATTCCATAAT GAACAAAGTG 60 

AGAGACATTA AAAATAAGTT TAAAAATGAA GACCTTACTG ATGAACTAAG CTTGAATAAA 120 

ATTTCTGCTG ATACTACAGA TAACTCGGGA ACTGTTAACC AAATTATGAT GATGGCAAAC 180 

AACCCAGAGG ACTGGTTGAG TTTGTTGCTC AAACTAGAGA AAAACAGTGT TCCGCTAAGT 240 

GATGCTCTTT TAAATAAATT GATTGGTCGT TACAGTCAAG CAATTGAAGC GCTTCCCCCA 300 

GATAAATATG GCCAAAATGA GAGTTTTGCT AGAATTCAAG TGAGATTTGC TGAATTAAAA 360 

GCTATTCAAG AGCCAGATGA TGCACGTGAC TACTTTCAAA TGGCCAGAGC AAACTGCAAG 420 

AAATTTGCTT TTGTTCATAT ATCTTTTGCA CAATTTGAAC TGTCACAAGG TAATGTCAAA 480 

AAAAGTAAAC AACTTCTTCA AAAAGCTGTA GAACGTGGAG CAGTACCACT AGAAATGCTG 540 

GAAATTGCCC TGCGGAATTT AAACCTCCAA AAAAAGCAGC TGCTTTCAGA GGAGGAAAAG 600 

AAGAATTTAT CAGCATCTAC GGTATTAACT GCCCAAGAAT CATTTTCCGG TTCACTTGGG 660 

CATTTACAGA ATAGGAACAA CAGTTGTGAT TCCAGAGGAC AGACTACTAA AGCCAGGTTT 720 

TTATATGGAG AGAACATGCC ACCACAAGAT GCAGAAATAG GTTACCGGAA TTCATTGAGA 780

CAAACTAACA AAACTAAACA GTCATGCCCA TTTGGAAGAG TCCCAGTTAA CCTTCTAAAT 840 

AGCCCAGATT GTGATGTGAA GACAGATGAT TCAGTTGTAC CTTGTTTTAT GAAAAGACAA 900 

ACCTCTAGAT CAGAATGCCG AGATTTGGTT GTGCCTGGAT CTAAACCAAG TGGAAATGAT 960 

TCCTGTGAAT TAAGAAATTT AAAGTCTGTT CAAAATAGTC ATTTCAAGGA ACCTCTGGTG 1020 

TCAGATGAAA AGAGTTCTGA ACTTATTATT ACTGATTCAA TAACCCTGAA GAATAAAACG 1080 

GAATCAAGTC TTCTAGCTAA ATTAGAAGAA ACTAAAGAGT ATCAAGAACC AGAGGTTCCA 1140 

GAGAGTAACC AGAAACAGTG GCAATCTAAG AGAAAGTCAG AGTGTATTAA CCAGAATCCT 1200 

GCTGCATCTT CAAATCACTG GCAGATTCCG GAGTTAGCCC GAAAAGTTAA TACAGAGCAG 1260 

AAACATACCA CTTTTGAGCA ACCTGTCTTT TCAGTTTCAA AACAGTCACC ACCAATATCA 1320 

ACATCTAAAT GGTTTGACCC AAAATCTATT TGTAAGACAC CAAGCAGCAA TACCTTGGAT 1380 
GATTACATGA GCTGTTTTAG AACTCCAGTT GTAAAGAATG ACTTTCCACC TGCTTGTCAG 1440

TTGTCAACAC CTTATGGCCA ACCTGCCTGT TTCCAGCAGC AACAGCATCA AATACTTGCC 1500 

ACTCCACTTC AAAATTTACA GGTTTTAGCA TCTTCTTCAG CAAATGAATG CATTTCGGTT 1560 

AAAGGAAGAA TTTATTCCAT TTTAAAGCAG ATAGGAAGTG GAGGTTCAAG CAAGGTATTT 1620 

CAGGTGTTAA ATGAAAAGAA ACAGATATAT GCTATAAAAT ATGTGAACTT AGAAGAAGCA 1680 

GATAACCAAA CTCTTGATAG TTACCGGAAC GAAATAGCTT ATTTGAATAA ACTACAACAA 1740 

CACAGTGATA AGATCATCCG ACTTTATGAT TATGAAATCA CGGACCAGTA CATCTACATG 1800

GTAATGGAGT GTGGAAATAT TGATCTTAAT AGTTGGCTTA AAAAGAAAAA ATCCATTGAT 1860 

CCATGGGAAC GCAAGAGTTA CTGGAAAAAT ATGTTAGAGG CAGTTCACAC AATCCATCAA 1920 

CATGGCATTG TTCACAGTGA TCTTAAACCA GCTAACTTTC TGATAGTTGA TGGAATGCTA 1980 

AAGCTAATTG ATTTTGGGAT TGCAAACCAA ATGCAACCAG ATACAACAAG TGTTGTTAAA 2040

GATTCTCAGG TTGGCACAGT TAATTATATG CCACCAGAAG CAATCAAAGA TATGTCTTCC 2100 

TCCAGAGAGA ATGGGAAATC TAAGTCAAAG ATAAGCCCCA AAAGTGATGT TTGGTCCTTA 2160 

GGATGTATTT TGTACTATAT GACTTACGGG AAAACACCAT TTCAGCAGAT AATTAATCAG 2220 

ATTTCTAAAT TACATGCCAT AATTGATCCT AATCATGAAA TTGAATTTCC CGATATTCCA 2280 

GAGAAAGATC TTCAAGATGT GTTAAAGTGT TGTTTAAAAA GGGACCCAAA ACAGAGGATA 2340 

TCCATTCCTG AGCTCCTGGC TCATCCCTAT GTTCAAATTC AAACTCATCC AGTTAACCAA 2400 

ATGGCCAAGG GAACCACTGA AGAAATGAAA TATGTTCTGG GCCAACTTGT TGGTCTGAAT 2460 

TCTCCTAACT CCATTTTGAA AGCTGCTAAA ACTTTATATG AACACTATAG TGGTGGTGAA 2520 
AGTCATAATT CTTCATCCTC CAAGACTTTT GAAAAAAAAA GGGGAAAAAA ATGA 

Seq ID NO: 647 Protein sequence
Protein Accession #: NP_003309.1

1 11 21 31 41 51 

| | | | | |

MESEDLSGRE LTIDSIMNKV RDIKNKFKNE DLTDELSLNK ISADTTDNSG TVNQIMMMAN 60 

NPEDWLSLLL KLEKNSVPLS DALLNKLIGR YSQAIEALPP DKYGQNESFA RIQVRFAELK 120 

AIQEPDDARD YFQMARANCK KFAFVHISFA QFELSQGNVK KSKQLLQKAV ERGAVPLEML 180

EIALRNLNLQ KKQLLSEEEK KNLSASTVLT AQESFSGSLG HLQNRNNSCD SRGQTTKARF 240

LYGENMPPQD AEIGYRNSLR QTNKTKQSCP FGRVPVNLLN SPDCDVKTDD SVVPCFMKRQ 300

TSRSECRDLV VPGSKPSGND SCELRNLKSV QNSHFKEPLV SDEKSSELII TDSITLKNKT 360

ESSLLAKLEE TKEYQEPEVP ESNQKQWQSK RKSECINQNP AASSNHWQIP ELARKVNTEQ 420

KHTTFEQPVF SVSKQSPPIS TSKWFDPKSI CKTPSSNTLD DYMSCFRTPV VKNDFPPACQ 480

LSTPYGQPAC FQQQQHQILA TPLQNLQVLA SSSANECISV KGRIYSILKQ IGSGGSSKVF 540

QVLNEKKQIY AIKYVNLEEA DNQTLDSYRN EIAYLNKLQQ HSDKIIRLYD YEITDQYIYM 600

VMECGNIDLN SWLKKKKSID PWERKSYWKN MLEAVHTIHQ HGIVHSDLKP ANFLIVDGML 660

KLIDFGIANQ MQPDTTSVVK DSQVGTVNYM PPEAIKDMSS SRENGKSKSK ISPKSDVWSL 720

GCILYYMTYG KTPFQQIINQ ISKLHAIIDP NHEIEFPDIP EKDLQDVLKC CLKRDPKQRI 780

SIPELLAHPY VQIQTHPVNQ MAKGTTEEMK YVLGQLVGLN SPNSILKAAK TLYEHYSGGE 840
SHNSSSSKTF EKKRGKK 

Seq ID NO: 648 DNA sequence
Nucleic Acid Accession #: NM_015507
Coding sequence: 241..1902

1 11 21 31 41 51 

| | | | | |

CCGCAGAGGA GCCTCGGCCA GGCTAGCCAG GGCGCCCCCA GCCCCTCCCC AGGCCGCGAG 60 

CGCCCCTGCC GCGGTGCCTG GCCTCCCCTC CCAGACTGCA GGGACAGCAC CCGGTAACTG 120 

CGAGTGGAGC GGAGGACCCG AGCGGCTGAG GAGAGAGGAG GCGGCGGCTT AGCTGCTACG 180 

GGGTCCGGCC GGCGCCCTCC CGAGGGGGGC TCAGGAGGAG GAAGGAGGAC CCGTGCGAGA 240 

ATGCCTCTGC CCTGGAGCCT TGCGCTCCCG CTGCTGCTCT CCTGGGTGGC AGGTGGTTTC 300 

GGGAACGCGG CCAGTGCAAG GCATCACGGG TTGTTAGCAT CGGCACGTCA GCCTGGGGTC 360 

TGTCACTATG GAACTAAACT GGCCTGCTGC TACGGCTGGA GAAGAAACAG CAAGGGAGTC 420 

TGTGAAGCTA CATGCGAACC TGGATGTAAG TTTGGTGAGT GCGTGGGACC AAACAAATGC 480 

AGATGCTTTC CAGGATACAC CGGGAAAACC TGCAGTCAAG ATGTGAATGA GTGTGGAATG 540 

AAACCCCGGC CATGCCAACA CAGATGTGTG AATACACACG GAAGCTACAA GTGCTTTTGC 600 

CTCAGTGGCC ACATGCTCAT GCCAGATGCT ACGTGTGTGA ACTCTAGGAC ATGTGCCATG 660 

ATAAACTGTC AGTACAGCTG TGAAGACACA GAAGAAGGGC CACAGTGCCT GTGTCCATCC 720 

TCAGGACTCC GCCTGGCCCC AAATGGAAGA GACTGTCTAG ATATTGATGA ATGTGCCTCT 780 

GGTAAAGTCA TCTGTCCCTA CAATCGAAGA TGTGTGAACA CATTTGGAAG CTACTACTGC 840

AAATGTCACA TTGGTTTCGA ACTGCAATAT ATCAGTGGAC GATATGACTG TATAGATATA 900 

AATGAATGTA CTATGGATAG CCATACGTGC AGCCACCATG CCAATTGCTT CAATACCCAA 960 

GGGTCCTTCA AGTGTAAATG CAAGCAGGGA TATAAAGGCA ATGGACTTCG GTGTTCTGCT 1020 

ATCCCTGAAA ATTCTGTGAA GGAAGTCCTC AGAGCACCTG GTACCATCAA AGACAGAATC 1080 

AAGAAGTTGC TTGCTCACAA AAACAGCATG AAAAAGAAGG CAAAAATTAA AAATGTTACC 1140 

CCAGAACCCA CCAGGACTCC TACCCCTAAG GTGAACTTGC AGCCCTTCAA CTATGAAGAG 1200 

ATAGTTTCCA GAGGCGGGAA CTCTCATGGA GGTAAAAAAG GGAATGAAGA GAAAATGAAA 1260 

GAGGGGCTTG AGGATGAGAA AAGAGAAGAG AAAGCCCTGA AGAATGACAT AGAGGAGCGA 1320 

AGCCTGCGAG GAGATGTGTT TTTCCCTAAG GTGAATGAAG CAGGTGAATT CGGCCTGATT 1380 

CTGGTCCAAA GGAAAGCGCT AACTTCCAAA CTGGAACATA AAGATTTAAA TATCTCGGTT 1440 

GACTGCAGCT TCAATCATGG GATCTGTGAC TGGAAACAGG ATAGAGAAGA TGATTTTGAC 1500 

TGGAATCCTG CTGATCGAGA TAATGCTATT GGCTTCTATA TGGCAGTTCC GGCCTTGGCA 1560 

GGTCACAAGA AAGACATTGG CCGATTGAAA CTTCTCCTAC CTGACCTGCA ACCCCAAAGC 1620 

AACTTCTGTT TGCTCTTTGA TTACCGGCTG GCCGGAGACA AAGTCGGGAA ACTTCGAGTG 1680 

TTTGTGAAAA ACAGTAACAA TGCCCTGGCA TGGGAGAAGA CCACGAGTGA GGATGAAAAG 1740 

TGGAAGACAG GGAAAATTCA GTTGTATCAA GGAACTGATG CTACCAAAAG CATCATTTTT 1800

GAAGCAGAAC GTGGCAAGGG CAAAACCGGC GAAATCGCAG TGGATGGCGT CTTGCTTGTT 1860 

TCAGGCTTAT GTCCAGATAG CCTTTTATCT GTGGATGACT GAATGTTACT ATCTTTATAT 1920 

TTGACTTTGr ATGTCAGTTC CCTGGTTTTT TTGATATTGC ATCATAGGAC CTCTGGCATT 1980 

TTAGAATTAC TAGCTGAAAA ATTGTAATGT ACCAACAGAA ATATTATTGT AAGATGCCTT 2040 

TCTTGTATAA GATATGCCAA TATTTGCTTT AAATATCATA TCACTGTATC TTCTCAGTCA 2100 

TTTCTGAATC TTTCCACATT ATATTATAAA ATATGGAAAT GTCAGTTTAT CTCCCCTCCT 2160 

CAGTATATCT GATTTGTATA AGTAAGTTGA TGAGCTTCTC TCTACAACAT TTCTAGAAAA 2220 

TAGAAAAAAA AGCACAGAGA AATGTTTAAC TGTTTGACTC TTATGATACT TCTTGGAAAC 2280 

TATGACATCA AAGATAGACT TTTGCCTAAG TGGCTTAGCT GGGTCTTTCA TAGCCAAACT 2340 
TGTATATTTA AATTCTTTGT AATAATAATA TCCAAATCAT CAAAAAAAAA AAAAAAAA



Seq ID NO: 649 Protein sequence
Protein Accession #: NP_056322

1 11 21 31 41 51
| | | | | |
MPLPWSLALP LLLSWVAGGF GNAASARHHG LLASARQPGV CHYGTKLACC YGWRRNSKGV 60
CEATCEPGCK FGECVGPNKC RCFPGYTGKT CSQDVNECGM KPRPCQHRCV NTHGSYKCFC 120
LSGHMLMPDA TCVNSRTCAM INCQYSCEDT EEGPQCLCPS SGLRLAPNGR DCLDIDECAS 180
GKVICPYNRR CVNTFGSYYC KCHIGFELQY ISGRYDCIDI NECTMDSHTC SHHANCFNTQ 240
GSFKCKCKQG YKGNGLRCSA IPENSVKEVL RAPGTIKDRI KKLLAHKNSM KKKAKIKNVT 300
PEPTRTPTPK VNLQPFNYEE IVSRGGNSHG GKKGNEEKMK EGLEDEKREE KALKNDIEER 360
SLRGDVFFPK VNEAGEFGLI LVQRKALTSK LEHKDLNISV DCSFNHGICD WKQDREDDFD 420
WNPADRDNAI GFYMAVPALA GHKKDIGRLK LLLPDLQPQS NFCLLFDYRL AGDKVGKLRV 480
FVKNSNNALA WEKTTSEDEK WKTGKIQLYQ GTDATKSIIF EAERGKGKTG EIAVDGVLLV 540
SGLCPDSLLS VDD



Seq ID NO: 650 DNA sequence 

Nucleic Acid Accession #: NM_003506.1

Coding sequence: 259..2379



GCAGCTCCAG 
TTAGACGGGG 
CTCATTTTCA 
ATCTTTGGAT 
ATCAGGAATT 
CTCCTAAGAG 
ATGGCCTACA 
GCGGTGGAAA 
ACTTTCCTCT 
TGTCGTAAAC 
ATCCGATGGC 
GTAACTTTTG 
AGAGACATTG 
TTTCTGGGAA 
CTAGAGTTTG 
TTCACATTCC 
ATATATTACT 
GGCGATAGCA 
CTAGGCTCTC 
GCTGGCACTG 
TGGAGTTGTG 
CCAGGTTTCC 
GGAGTTTGCT 
CTGTGCCTTT 
CATGTTCGAC 
ATTCGAATTG 
TACGTCTATG 
CGTCAGTACC 
TTATTTATGA 
GGAAGCAAAA 
CCAATCAGTG 
TCTAAAGTTA 
TCCAAATCCA 
ATTACTAGCC 
ACATCAATGA 
TGTGGTGAAC 
GGGAAGGGCC 
AAGAGTGATA 
TCAGAACCAA 
AGAAAAGAGC 
CAGAAGCAAA 
TACGTTCTTC 
TCAAGAATAA 
AAATGTGCAG 
CCTTTTCTAT 
TTTACCTTTT 
GTATCTTTTT 
ATTCAAGTAT 
ATTTCTAAGA 
GTCTTATAAT 
TGTGATTTTT 
GGTGCTTACT 
ATATTTAAAA 
GGCCAAGTGC 
AACCACTTAC 
TACATTTTGT 



11 
I 

TCCCGGACGC 
ACGGGAAGGG 
GGAAAGCCTG 
GGGGATCTTC 
TGAAGAAAAT 
GGCACAGTCT 
ACATGACGTT 
TGGAGCATTT 
GCAAAGCATT 
TTTGTGAGAA 
CTGAGGAGCT 
ATCCACACAC 
GATTTTGGTG 
TTGACCAGTG 
CAAAAAGTTT 
TTACTTTTTT 
CTGTCTGTTA 
CAGCCTGCAA 
AAAATAAGGC 
TGTGGTGGGT 
AAGCCATCGA 
TGACTGTTAT 
TTGTTGGCCT 
GTGTGTTTGT 
AAGTCATACA 
GAGTCTTCAG 
AGCAAGTGAA 
ATATCCCATG 
TAAAATACCT 
AGACATGCAC 
AAAGTCGAAG 
AACACAAAAA 
TGGGAACCAG 
ATGATTACCT 
GAGAGGTGAA 
CTGCCTCGCC 
AGGCAGGCAG 
TTACTGACAC 
GCAGCCTCAA 
AGGGAGGTGG 
TTTGTGTTAC 
TTTTGCACTT 
TATGACTCAT 
GTTAATAATA 
TTATGAAGAT 
TGATATAAAA 
ATACATATTT 
TTTTATCATG 
AAATTGTAAA 
AGGAATTTAA 
ATAGTCTCGT 
CAAAGAGTGT 
TAAAATGTCC 
AATTGACTTC 
AGTTGCTTAT 
ATTATACAGT 



21 

I 

AACCCCGGAG 
ACAGCGGCCT 
AAAATGAGTA 
TGAGGATGCA 
GGAGATGTTT 
CTTCACCTGT 
TTTCCCTAAT 
TCTTCCTCTC 
TGTACCAACC 
AGTATATTCT 
TGAATGTGAC 
AGAATTTCTT 
TCCAAGGCAT 
TGCGCCTCCA 
TATTGGAACA 
AATTGATGTT 
CAGCATTGTA 
TAAGGCAGAT 
TTGCACCGTT 
GATTCTTACC 
GCAAAAAGCA 
GCTTCTTGCT 
TTATGACCTG 
TGGGCTCTCT 
ACATGATGGC 
CGGCTTGTAT 
CAGGATTACC 
TCCTTATCAG 
GATGACATTA 
AGAATGGGCT 
AGTACTACAG 
GAAGCACTAT 
CACAGGAGCT 
AGGACAAGAA 
AGCGGACGGA 
AGCAGCATCC 
TGTATCTGAA 
TGGCCTGGCA 
AGGTTCCACA 
TTGTCATTCA 
ACTGGAAGTG 
AAAGTTGCAT 

TTCACACAAA 
TTTTTTTAAT 

TCTACTCTTG 
TCAAGATATT 
GAAAATAAGC 
CTATTGTGAT 
ATAGTCTTCT 
CTTTAAAAAC 
TTTAGGAATT 
CCACTATTGA 
TAAAGGGTTA 
CCTTTTTTAA 
ATTTTTTGTT 
ACCTTTCTCA 



31 
I 

CCGTCTCAGG 
TCGACCGCCC 
AAATAGTGAA 
AAGAGTGATT 
ACATTTTTGT 
GAACCAATTA 
CTGATGGGTC 
GCAAATCTGG 
TGCATAGAAC 
GATTGCAAAA 
AGATTACAAT 
GGTCCTCAGA 
CTTAAGACTT 
TGCCCCAACA 
GTTTCAATAT 
AGAAGATTCA 
TCTCTTATGT 
GAGAAGCTAG 
TTGTTCATGC 
ATTACTTGGT 
GTGTGGTTTC 
CTGAACAAAG 
GATGCTTCTC 
CTTCTTTTAG 
CGGAACCAAG 
CTTGTGCCAT 
TGGGAGATAA 
GCAAAAGCAA 
ATTGTTGGCA 
GGGTTTTTTA 
GAATCATGTG 
AAACCAAGTT 
ACAGCAAATC 
ACTTTGACAG 
GCTAGCACCC 
ATCTCCAGAC 
AGTGCGCGGA 
CAGAGCAACA 
TCTCTGCTTG 
GATACTTGAA 
ACCTATGCAC 
TGCCTACTGT 
GGTTAATGAC 
AGTGTGGGAG 
GTAAGAGTAT 
TCTTTGCTGA 
TTATATGTAT 
ATTTTAGCAC 
TTTATACTGT 
CCACTTATTG 
TCACAGATCT 
TTGTATTATG 
GTAGACAAAA 
TGTTTCATGA 
TTAACTTTTG 
GACATTTTGT 



41 

I 

TCCCTGGGGG 
CCCGAGTAAT 
ATGAGGAATT 
CATCCAAGCC 
TGACGTGTAT 
CTGTTCCCAG 
ATTATGACCA 
AATGTTCACC 
AAATTCATGT 
AATTAATTGA 
ACTGTGATGA 
AGAAAACAGA 
CTGGGGGACA 
TGTATTTTAA 
TTTGTCTTTG 
GATACCCAGA 
ACTTCATTGG 
AACTTGGTGA 
TTTTGTATTT 
TCTTAGCTGC 
ATGCTGTTGC 
TTGAAGGAGA 
GCTACTTTGT 
CTGGCATTAT 
AAAAACTAAA 
TAGTGACACT 
CTTGGGTCTC 
AAGCTCGACC 
TCTCTGCTGT 
AACGAAATCG 
AGTTTTTCTT 
CACACAAGCT 
ATGGCACTTC 
AAATCCAAAC 
CCAGGTTAAG 
TCTCTGGGGA 
GTGAAGGAAG 
ATTTGCAGGT 
TTCACCCAGT 
GAACATTTTC 
TGTTTTGTAA 
TATACTGGAA 
AACAATATAC 
GACAGAGTTA 
TTTAAGATGT 
AGTATTTAAA 
TTGAACTTTT 
TTTGGTAGCT 
AAAAAAAGAT 
ATACCTTACC 
AAATTATGTA 
CTGCTCACTG 
TGTTAGTCTT 
CCACCCATTG 
TTTCTTAACA 
AG 



51 
I 

GAACGGTGGG 
TGACCCAGGA 
TGAACATTTT 
ATGTGGTAAA 
TTTTCTACCC 
ATGTATGAAA 
GAGTATTGCC 
AAACATTGAA 
GGTTCCACCT 
CACTTTTGGG 
GACTGTTCCT 
ACAAGTCCAA 
AGGATATAAG 
AAGTGATGAG 
TGCAACTCTG 
GAGACCAATT 
ATTTTTGCTG 
CACTGTTGTC 
TTTCACAATG 
AGGAAGAAAA 
ATGGGGAACA 
CAACATTAGT 
ACTCTTGCCA 
TTCCTTAAAT 
GAAATTTATG 
TCTCGGATGT 
TGATCATTGT 
AGAATTGGCT 
CTTCTGGGTT 
CAAGAGAGAT 
AAAGCACAAT 
GAAGGTCATT 
TGCAGTAGCA 
CTCACCAGAA 
AGAACAGGAC 
ACAGGTCGAC 
GATTAGTCCA 
CCCCAGTTCT 
TTCAGGAGTG 
TCTCGTTACT 
GAATCACTGT 
AAAATAGAGT 
CTGAAAACAG 
GAGGAATCTT 
ACTATGCTAT 
TCTTATCCTT 
TTGAAATCCT 
TTTACACTGA 
ATACCAAAAA 
ATCTAAAATG 
ACTGAAATAA 
ATCCTTCTGC 
TTGTATATTA 
ATTGTATTAT 
TTTAGAATAT 



60 
120 
180 
240 
300 
360 
420 
460 
540 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2260 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 



Seq ID NO: 651 Protein sequence 
Protein Accession #: NP_003497.1 

1 11 21 31 41 51 

I I I I I I 

MEMFTFLLTC IFLPLLRGHS LFTCEPITVP RCMKMAYNMT FFPNLMGHYD QSIAAVEMEH 60 

FLPLANLECS PNIETFLCKA FVPTCIEQIH VVPPCRKLCE KVYSDCKKLI DTFGIRWPEE 120 

LECDRLQYCD ETVPVTFDPH TEFLGPQKKT EQVQRDIGFW CPRHLKTSGG QGYKFLGIDQ 180 

CAPPCPNMYF KSDELEFAKS FIGTVSIFCL CATLFTFLTF LIDVRRFRYP ERPIIYYSVC 240 

YSIVSLMYFI GFLLGDSTAC NKADEKLELG DTVVLGSQNK ACTVLFMLLY FFTMAGTVWW 300 

VILTITWFLA AGRKWSCEAI EQKAVWFHAV AWGTPGFLTV MLLALNKVEG DNISGVCFVG 360 

LYDLDASRYF VLLPLCLCVF VGLSLLLAGI ISLNHVRQVI QHDGRNQEKL KKFMIRIGVF 420 

SGLYLVPLVT LLGCYVYEQV NRITWEITWV SDHCRQYHIP CPYQAKAKAR PELALFMIKY 480 

LMTLIVGISA VFWVGSKKTC TEWAGFFKRN RKRDPISESR RVLQESCEFF LKHNSKVKHK 540 

KKHYKPSSHK LKVISKSMGT STGATANHGT SAVAITSHDY LGQETLTEIQ TSPETSMREV 600 

KADGASTPRL REQDCGEPAS PAASISRLSG EQVDGKGQAG SVSESARSEG RISPKSDITD 660 
TGLAQSNNLQ VPSSSEPSSL KGSTSLLVHP VSGVRKEQGG GCHSDT 

Seq ID NO: 652 DNA sequence 

Nucleic Acid Accession #: NM_014791.1 

Coding sequence: 171.. 2126 

1 11 21 31 41 51 

I I I I I I 

TTGGCGGGCG GAAGCGGCCA CAACCCGGCG ATCGAAAAGA TTCTTAGGAA CGCCGTACCA 60 

GCCGCGTCTC TCAGGACAGC AGGCCCCTGT CCTTCTGTCG GGCGCCGCTC AGCCGTGCCC 120 

TCCGCCCCTC AGGTTCTTTT TCTAATTCCA AATAAACTTG CAAGAGGACT ATGAAAGATT 180 

ATGATGAACT TCTCAAATAT TATGAATTAC ATGAAACTAT TGGGACAGGT GGCTTTGCAA 240 

AGGTCAAACT TGCCTGCCAT ATCCTTACTG GAGAGATGGT AGCTATAAAA ATCATGGATA 300 

AAAACACACT AGGGAGTGAT TTGCCCCGGA TCAAAACGGA GATTGAGGCC TTGAAGAACC 360 

TGAGACATCA GCATATATGT CAACTCTACC ATGTGCTAGA GACAGCGAAC AAAATATTCA 420 

TGGTTCTTGA GTACTGCCCT GGAGGAGAGC TGTTTGACTA TATAATTTCC CAGGATCGCC 480 

TGTCAGAAGA GGAGACCCGG GTTGTCTTCC GTCAGATAGT ATCTGCTGTT GCTTATGTGC 540 

ACAGCCAGGG CTATGCTCAC AGGGACCTCA AGCCAGAAAA TTTGCTGTTT GATGAATATC 600 

ATAAATTAAA GCTGATTGAC TTTGGTCTCT GTGCAAAACC CAAGGGTAAC AAGGATTACC 660 

ATCTACAGAC ATGCTGTGGG AGTCTGGCTT ATGCAGCACC TGAGTTAATA CAAGGCAAAT 720 

CATATCTTGG ATCAGAGGCA GATGTTTGGA GCATGGGCAT ACTGTTATAT GTTCTTATGT 780 

GTGGATTTCT ACCATTTGAT GATGATAATG TAATGGCTTT ATACAAGAAG ATTATGAGAG 840 

GAAAATATGA TGTTCCCAAG TGGCTCTCTC CCAGTAGCAT TCTGCTTCTT CAACAAATGC 900 

TGCAGGTGGA CCCAAAGAAA CGGATTTCTA TGAAAAATCT ATTGAACCAT CCCTGGATCA 960 

TGCAAGATTA CAACTATCCT GTTGAGTGGC AAAGCAAGAA TCCTTTTATT CACCTCGATG 1020 

ATGATTGCGT AACAGAACTT TCTGTACATC ACAGAAACAA CAGGCAAACA ATGGAGGATT 1080 

TAATTTCACT GTGGCAGTAT GATCACCTCA CGGCTACCTA TCTTCTGCTT CTAGCCAAGA 1140 

AGGCTCGGGG AAAACCAGTT CGTTTAAGGC TTTCTTCTTT CTCCTGTGGA CAAGCCAGTG 1200 

CTACCCCATT CACAGACATC AAGTCAAATA ATTGGAGTCT GGAAGATGTG ACCGCAAGTG 1260 

ATAAAAATTA TGTGGCGGGA TTAATAGACT ATGATTGGTG TGAAGATGAT TTATCAACAG 1320 

GTGCTGCTAC TCCCCGAACA TCACAGTTTA CCAAGTACTG GACAGAATCA AATGGGGTGG 1380 

AATCTAAATC ATTAACTCCA GCCTTATGCA GAACACCTGC AAATAAATTA AAGAACAAAG 1440 

AAAATGTATA TACTCCTAAG TCTGCTGTAA AGAATGAAGA GTACTTTATG TTTCCTGAGC 1500 

CAAAGACTCC AGTTAATAAG AACCAGCATA AGAGAGAAAT ACTCACTACG CCAAATCGTT 1560 

ACACTACACC CTCAAAAGCT AGAAACCAGT GCCTGAAAGA AACTCCAATT AAAATACCAG 1620 

TAAATTCAAC AGGAACAGAC AAGTTAATGA CAGGTGTCAT TAGCCCTGAG AGGCGGTGCC 1680 

GCTCAGTGGA ATTGGATCTC AACCAAGCAC ATATGGAGGA GACTCCAAAA AGAAAGGGAG 1740 

CCAAAGTGTT TGGGAGCCTT GAAAGGGGGT TGGATAAGGT TATCACTGTG CTCACCAGGA 1800 

GCAAAAGGAA GGGTTCTGCC AGAGACGGGC CCAGAAGACT AAAGCTTCAC TATAATGTGA 1860 

CTACAACTAG ATTAGTGAAT CCAGATCAAC TGTTGAATGA AATAATGTCT ATTCTTCCAA 1920 

AGAAGCATGT TGACTTTGTA CAAAAGGGTT ATACACTGAA GTGTCAAACA CAGTCAGATT 1980 

TTGGGAAAGT GACAATGCAA TTTGAATTAG AAGTGTGCCA GCTTCAAAAA CCCGATGTGG 2040 

TGGGTATCAG GAGGCAGCGG CTTAAGGGCG ATGCCTGGGT TTACAAAAGA TTAGTGGAAG 2100 

ACATCCTATC TAGCTGCAAG GTATAATTGA TGGATTCTTC CATCCTGCCG GATGAGTGTG 2160 

GGTGTGATAC AGCCTACATA AAGACTGTTA TGATCGCTTT GATTTTAAAG TTCATTGGAA 2220 

CTACCAACTT GTTTCTAAAG AGCTATCTTA AGACCAATAT CTCTTTGTTT TTAAACAAAA 2280 

GATATTATTT TGTGTATGAA TCTAAATCAA GCCCATCTGT CATTATGTTA CTGTCTTTTT 2340 

TAATCATGTG GTTTTGTATA TTAATAATTG TTGACTTTCT TAGATTCACT TCCATATGTG 2400 

AATGTAAGCT CTTAACTATG TCTCTTTGTA ATGTGTAATT TCTTTCTGAA ATAAAACCAT 2460 
TTGTGAATAT 

Seq ID NO: 653 Protein sequence 
Protein Accession #: NP_055606.1 

1 11 21 31 41 51 

I I I I I I 

MKDYDELLKY YELHETIGTG GFAKVKLACH ILTGEMVAIK IMDKNTLGSD LPRIKTEIEA 60 

LKNLRHQHIC QLYHVLETAN KIFMVLEYCP GGELFDYIIS QDRLSEEETR VVFRQIVSAV 120 

AYVHSQGYAH RDLKPENLLF DEYHKLKLID FGLCAKPKGN KDYHLQTCCG SLAYAAPELI 180 

QGKSYLGSEA DVWSMGILLY VLMCGFLPFD DDNVMALYKK IMRGKYDVPK WLSPSSILLL 240 

QQMLQVDPKK RISMKNLLNH PWIMQDYNYP VEWQSKNPFI HLDDDCVTEL SVHHRNNRQT 300 

MEDLISLWQY DHLTATYLLL LAKKARGKPV RLRLSSFSCG QASATPFTDI KSNNWSLEDV 360 

TASDKNYVAG LIDYDWCEDD LSTGAATPRT SQFTKYWTES NGVESKSLTP ALCRTPANKL 420 

KNKENVYTPK SAVKNEEYFM FPEPKTPVNK NQHKREILTT PNRYTTPSKA RNQCLKETPI 480 

KIPVNSTGTD KLMTGVISPE RRCRSVELDL NQAHMEETPK RKGAKVFGSL ERGLDKVITV 540 

LTRSKRKGSA RDGPRRLKLH YNVTTTRLVN PDQLLNEIMS ILPKKHVDFV QKGYTLKCQT 600 
QSDFGKVTMQ FELEVCQLQK PDVVGIRRQR LKGDAWVYKR LVEDILSSCK V 
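
In this listing each full row carries six groups of ten residues followed by the position of the last residue in the row. A row whose residue count does not advance to its printed label has usually lost or gained a character in scanning. The following is a minimal sketch of such a consistency check; the function name `check_row` is hypothetical, and the second example row is a deliberately damaged copy used only for illustration (two adjacent V residues merged into one W).

```python
def check_row(row, prev_end=0):
    """Check one listing row against its trailing position label.

    Returns (residue_count, label, ok). `prev_end` is the label of the
    preceding row (0 for the first row). A final row without a label is
    accepted as-is, since the listing leaves partial rows unnumbered.
    """
    tokens = row.split()
    # The trailing all-digit token, if present, is the position label.
    label = int(tokens[-1]) if tokens and tokens[-1].isdigit() else None
    groups = tokens[:-1] if label is not None else tokens
    count = sum(len(group) for group in groups)
    ok = label is None or prev_end + count == label
    return count, label, ok


# First row of a protein listing: 60 residues, label 60.
good = "MKDYDELLKY YELHETIGTG GFAKVKLACH ILTGEMVAIK IMDKNTLGSD LPRIKTEIEA 60"
# Illustrative damaged row: the last group has only nine residues.
damaged = "LKNLRHQHIC QLYHVLETAN KIFMVLEYCP GGELFDYIIS QDRLSEEETR WFRQIVSAV 120"

good_result = check_row(good)
damaged_result = check_row(damaged, prev_end=60)
```

A failed check only locates a suspect row. Deciding which residue is wrong still requires comparison with the corresponding coding sequence.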

Seq ID NO: 654 DNA sequence 
Nucleic Acid Accession #: NM_000582 
Coding sequence: 88.. 990 

1 11 21 31 41 51 

I I I I I I 

GCAGAGCACA GCATCGTCGG GACCAGACTC GTCTCAGGCC AGTTGCAGCC TTCTCAGCCA 60 

AACGCCGACC AAGGAAAACT CACTACCATG AGAATTGCAG TGATTTGCTT TTGCCTCCTA 120 




GGCATCACCT GTGCCATACC AGTTAAACAG GCTGATTCTG GAAGTTCTGA GGAAAAGCAG 180 

CTTTACAACA AATACCCAGA TGCTGTGGCC ACATGGCTAA ACCCTGACCC ATCTCAGAAG 240 

CAGAATCTCC TAGCCCCACA GACCCTTCCA AGTAAGTCCA ACGAAAGCCA TGACCACATG 300 

GATGATATGG ATGATGAAGA TGATGATGAC CATGTGGACA GCCAGGACTC CATTGACTCG 360 

AACGACTCTG ATGATGTAGA TGACACTGAT GATTCTCACC AGTCTGATGA GTCTCACCAT 420 

TCTGATGAAT CTGATGAACT GGTCACTGAT TTTCCCACGG ACCTGCCAGC AACCGAAGTT 480 

TTCACTCCAG TTGTCCCCAC AGTAGACACA TATGATGGCC GAGGTGATAG TGTGGTTTAT 540 

GGACTGAGGT CAAAATCTAA GAAGTTTCGC AGACCTGACA TCCAGTACCC TGATGCTACA 600 

GACGAGGACA TCACCTCACA CATGGAAAGC GAGGAGTTGA ATGGTGCATA CAAGGCCATC 660 

CCCGTTGCCC AGGACCTGAA CGCGCCTTCT GATTGGGACA GCCGTGGGAA GGACAGTTAT 720 

GAAACGAGTC AGCTGGATGA CCAGAGTGCT GAAACCCACA GCCACAAGCA GTCCAGATTA 780 

TATAAGCGGA AAGCCAATGA TGAGAGCAAT GAGCATTCCG ATGTGATTGA TAGTCAGGAA 840 

CTTTCCAAAG TCAGCCGTGA ATTCCACAGC CATGAATTTC ACAGCCATGA AGATATGCTG 900 

GTTGTAGACC CCAAAAGTAA GGAAGAAGAT AAACACCTGA AATTTCGTAT TTCTCATGAA 960 

TTAGATAGTG CATCTTCTGA GGTCAATTAA AAGGAGAAAA AATACAATTT CTCACTTTGC 1020 

ATTTAGTCAA AAGAAAAAAT GCTTTATAGC AAAATGAAAG AGAACATGAA ATGCTTCTTT 1080 

CTCAGTTTAT TGGTTGAATG TGTATCTATT TGAGTCTGGA AATAACTAAT GTGTTTGATA 1140 

ATTAGTTTAG TTTGTGGCTT CATGGAAACT CCCTGTAAAC TAAAAGCTTC AGGGTTATGT 1200 

CTATGTTCAT TCTATAGAAG AAATGCAAAC TATCACTGTA TTTTAATATT TGTTATTCTC 1260 

TCATGAATAG AAATTTATGT AGAAGCAAAC AAAATACTTT TACCCACTTA AAAAGAGAAT 1320 

ATAACATTTT ATGTCACTAT AATCTTTTGT TTTTTAAGTT AGTGTATATT TTGTTGTGAT 1380 

TATCTTTTTG TGGTGTGAAT AAATCTTTTA TCTTGAATGT AATAAGAATT TGGTGGTGTC 1440 

AATTGCTTAT TTGTTTTCCC ACGGTTGTCC AGCAATTAAT AAAACATAAC CTTTTTTACT 1500 
GCCTAAAAAA AAAAAAAAAA AAAA 

Seq ID NO: 655 Protein sequence 
Protein Accession #: NP_000573 

1 11 21 31 41 51 

I I I I I I 

MRIAVICFCL LGITCAIPVK QADSGSSEEK QLYNKYPDAV ATWLNPDPSQ KQNLLAPQTL 60 

PSKSNESHDH MDDMDDEDDD DHVDSQDSID SNDSDDVDDT DDSHQSDESH HSDESDELVT 120 

DFPTDLPATE VFTPVVPTVD TYDGRGDSVV YGLRSKSKKF RRPDIQYPDA TDEDITSHME 180 

SEELNGAYKA IPVAQDLNAP SDWDSRGKDS YETSQLDDQS AETHSHKQSR LYKRKANDES 240 
NEHSDVIDSQ ELSKVSREFH SHEFHSHEDM LVVDPKSKEE DKHLKFRISH ELDSASSEVN 

Seq ID NO: 656 DNA sequence 

Nucleic Acid Accession #: NM_003108.1 

Coding sequence: 76.. 1401 

1 11 21 31 41 51 

I I I I I I 

GGGGTGGGAG GGGGAGGGGG ACCTCCGCAC GAGACCCAGC GGCCCGGGTT GGAGCGTCCA 60 

GCCCTGCAAC GGATCATGGT GCAGCAGGCG GAGAGCTTGG AAGCGGAGAG CAACCTGCCC 120 

CGGGAGGCGC TGGACACGGA GGAGGGCGAA TTCATGGCTT GCAGCCCGGT GGCCCTGGAC 180 

GAGAGCGACC CAGACTGGTG CAAGACGGCG TCGGGCCACA TCAAGCGGCC GATGAACGCG 240 

TTCATGGTAT GGTCCAAGAT CGAACGCAGG AAGATCATGG AGCAGTCTCC GGACATGCAC 300 

AACGCCGAGA TCTCCAAGAG GCTGGGCAAG CGCTGGAAAA TGCTGAAGGA CAGCGAGAAG 360 

ATCCCGTTCA TCCGGGAGGC GGAGCGGCTG CGGCTCAAGC ACATGGCCGA CTACCCCGAC 420 

TACAAGTACC GGCCCCGGAA AAAGCCCAAA ATGGACCCCT CGGCCAAGCC CAGCGCCAGC 480 

CAGAGCCCAG AGAAGAGCGC GGCCGGCGGC GGCGGCGGGA GCGCGGGCGG AGGCGCGGGC 540 

GGTGCCAAGA CCTCCAAGGG CTCCAGCAAG AAATGCGGCA AGCTCAAGGC CCCCGCGGCC 600 

GCGGGCGCCA AGGCGGGCGC GGGCAAGGCG GCCCAGTCCG GGGACTACGG GGGCGCGGGC 660 

GACGACTACG TGCTGGGCAG CCTGCGCGTG AGCGGCTCGG GCGGCGGCGG CGCGGGCAAG 720 

ACGGTCAAGT GCGTGTTTCT GGATGAGGAC GACGACGACG ACGACGACGA CGACGAGCTG 780 

CAGCTGCAGA TCAAACAGGA GCCGGACGAG GAGGACGAGG AACCACCGCA CCAGCAGCTC 840 

CTGCAGCCGC CGGGGCAGCA GCCGTCGCAG CTGCTGAGAC GCTACAACGT CGCCAAAGTG 900 

CCCGCCAGCC CTACGCTGAG CAGCTCGGCG GAGTCCCCCG AGGGAGCGAG CCTCTACGAC 960 

GAGGTGCGGG CCGGCGCGAC CTCGGGCGCC GGGGGCGGCA GCCGCCTCTA CTACAGCTTC 1020 

AAGAACATCA CCAAGCAGCA CCCGCCGCCG CTCGCGCAGC CCGCGCTGTC GCCCGCGTCC 1080 

TCGCGCTCGG TGTCCACCTC CTCGTCCAGC AGCAGCGGCA GCAGCAGCGG CAGCAGCGGC 1140 

GAGGACGCCG ACGACCTGAT GTTCGACCTG AGCTTGAATT TCTCTCAAAG CGCGCACAGC 1200 

GCCAGCGAGC AGCAGCTGGG GGGCGGCGCG GCGGCCGGGA ACCTGTCCCT GTCGCTGGTG 1260 

GATAAGGATT TGGATTCGTT CAGCGAGGGC AGCCTGGGCT CCCACTTCGA GTTCCCCGAC 1320 

TACTGCACGC CGGAGCTGAG CGAGATGATC GCGGGGGACT GGCTGGAGGC GAACTTCTCC 1380 

GACCTGGTGT TCACATATTG AAAGGCGCCC GCTGCTCGCT CTTTCTCTCG GAGGGTGCAG 1440 

AGCTGGGTTC CTTGGGAGGA AGTTGTAGTG GTGATGATGA TGATGATGAT AATGATGATG 1500 

ATGATGGTGG TGTTGATGGT GGCGGTGGTA GGGTGGAGGG GAGAGAAGAA GATGCTGATG 1560 

ATATTGATAA GATGTCGTGA CGCAAAGAAA TTGGAAAACA TGATGAAAAT TTTGGTGGAG 1620 

TTAAAGTGAA ATGAGTAGTT TTTAAACATT TTTCCTGTCC TTTTTTTGTC CCCCCTCCCT 1680 

TCCTTTATCG TGTCTCAAGG TAGTTGCATA CCTAGTCTGG AGTTGTGATT ATTTTCCCAA 1740 

AAAATGTGTT TTTGTAATTA CTATTTCTTT TTCCTGAAAT TCGTGATTGC AACAAAGGCA 1800 

GAGGGGGCGG CGCGGCGGAG GGGAGGTAGG ACCCGCTCCG GAAGGCGCTG TTTGAAGCTT 1860 

GTCGGTCTTT GAAGTCTGGA AGACGTCTGC AGAGGACCCT TTTGGCAGCA CAACTGTTAC 1920 

TCTAGGGAGT TGGTGGAGAT ATTTTTTTTT CTTAAGAGAA CTTAAAGAAC TGGTGATTTT 1980 
TTTTTAACAA AAAAAGGG 

Seq ID NO: 657 Protein sequence 
Protein Accession #: NP_003099.1 

1 11 21 31 41 51 

I I I I I I 

MVQQAESLEA ESNLPREALD TEEGEFMACS PVALDESDPD WCKTASGHIK RPMNAFMVWS 60 

KIERRKIMEQ SPDMHNAEIS KRLGKRWKML KDSEKIPFIR EAERLRLKHM ADYPDYKYRP 120 

RKKPKMDPSA KPSASQSPEK SAAGGGGGSA GGGAGGAKTS KGSSKKCGKL KAPAAAGAKA 180 

GAGKAAQSGD YGGAGDDYVL GSLRVSGSGG GGAGKTVKCV FLDEDDDDDD DDDELQLQIK 240 

QEPDEEDEEP PHQQLLQPPG QQPSQLLRRY NVAKVPASPT LSSSAESPEG ASLYDEVRAG 300 

ATSGAGGGSR LYYSFKNITK QHPPPLAQPA LSPASSRSVS TSSSSSSGSS SGSSGEDADD 360 

LMFDLSLNFS QSAHSASEQQ LGGGAAAGNL SLSLVDKDLD SFSEGSLGSH FEFPDYCTPE 420 
LSEMIAGDWL EANFSDLVFT Y 

Seq ID NO: 658 DNA sequence 
Nucleic Acid Accession #: NM_001719 
Coding sequence: 123.. 1418 

1 11 21 31 41 51 

I I I I I I 

GGGCGCAGCG GGGCCCGTCT GCAGCAAGTG ACCGACGGCC GGGACGGCCG CCTGCCCCCT 60 

CTGCCACCTG GGGCGGTGCG GGCCCGGAGC CCGGAGCCCG GGTAGCGCGT AGAGCCGGCG 120 

CGATGCACGT GCGCTCACTG CGAGCTGCGG CGCCGCACAG CTTCGTGGCG CTCTGGGCAC 180 

CCCTGTTCCT GCTGCGCTCC GCCCTGGCCG ACTTCAGCCT GGACAACGAG GTGCACTCGA 240 

GCTTCATCCA CCGGCGCCTC CGCAGCCAGG AGCGGCGGGA GATGCAGCGC GAGATCCTCT 300 

CCATTTTGGG CTTGCCCCAC CGCCCGCGCC CGCACCTCCA GGGCAAGCAC AACTCGGCAC 360 

CCATGTTCAT GCTGGACCTG TACAACGCCA TGGCGGTGGA GGAGGGCGGC GGGCCCGGCG 420 

GCCAGGGCTT CTCCTACCCC TACAAGGCCG TCTTCAGTAC CCAGGGCCCC CCTCTGGCCA 480 

GCCTGCAAGA TAGCCATTTC CTCACCGACG CCGACATGGT CATGAGCTTC GTCAACCTCG 540 

TGGAACATGA CAAGGAATTC TTCCACCCAC GCTACCACCA TCGAGAGTTC CGGTTTGATC 600 

TTTCCAAGAT CCCAGAAGGG GAAGCTGTCA CGGCAGCCGA ATTCCGGATC TACAAGGACT 660 

ACATCCGGGA ACGCTTCGAC AATGAGACGT TCCGGATCAG CGTTTATCAG GTGCTCCAGG 720 

AGCACTTGGG CAGGGAATCG GATCTCTTCC TGCTCGACAG CCGTACCCTC TGGGCCTCGG 780 

AGGAGGGCTG GCTGGTGTTT GACATCACAG CCACCAGCAA CCACTGGGTG GTCAATCCGC 840 

GGCACAACCT GGGCCTGCAG CTCTCGGTGG AGACGCTGGA TGGGCAGAGC ATCAACCCCA 900 

AGTTGGCGGG CCTGATTGGG CGGCACGGGC CCCAGAACAA GCAGCCCTTC ATGGTGGCTT 960 

TCTTCAAGGC CACGGAGGTC CACTTCCGCA GCATCCGGTC CACGGGGAGC AAACAGCGCA 1020 

GCCAGAACCG CTCCAAGACG CCCAAGAACC AGGAAGCCCT GCGGATGGCC AACGTGGCAG 1080 

AGAACAGCAG CAGCGACCAG AGGCAGGCCT GTAAGAAGCA CGAGCTGTAT GTCAGCTTCC 1140 

GAGACCTGGG CTGGCAGGAC TGGATCATCG CGCCTGAAGG CTACGCCGCC TACTACTGTG 1200 

AGGGGGAGTG TGCCTTCCCT CTGAACTCCT ACATGAACGC CACCAACCAC GCCATCGTGC 1260 

AGACGCTGGT CCACTTCATC AACCCGGAAA CGGTGCCCAA GCCCTGCTGT GCGCCCACGC 1320 

AGCTCAATGC CATCTCCGTC CTCTACTTCG ATGACAGCTC CAACGTCATC CTGAAGAAAT 1380 

ACAGAAACAT GGTGGTCCGG GCCTGTGGCT GCCACTAGCT CCTCCGAGAA TTCAGACCCT 1440 

TTGGGGCCAA GTTTTTCTGG ATCCTCCATT GCTCGCCTTG GCCAGGAACC AGCAGACCAA 1500 

CTGCCTTTTG TGAGACCTTC CCCTCCCTAT CCGCAACTTT AAAGGTGTGA GAGTATTAGG 1560 

AAACATGAGC AGCATATGGC TTTTGATCAG TTTTTCAGTG GCAGCATCCA ATGAACAAGA 1620 

TCCTACAAGC TGTGCAGGCA AAACCTAGCA GGAAAAAAAA ACAACGCATA AAGAAAAATG 1680 

GCCGGGCCAG GTCATTGGCT GGGAAGTCTC AGCCATGCAC GGACTCGTTT CCAGAGGTAA 1740 

TTATGAGCGC CTACCAGCCA GGCCACCCAG CCGTGGGAGG AAGGGGGCGT GGCAAGGGGT 1800 

GGGCACATTG GTGTCTGTGC GAAAGGAAAA TTGACCCGGA AGTTCCTGTA ATAAATGTCA 1860 
CAATAAAACG AATGAATG 

Seq ID NO: 659 Protein sequence 
Protein Accession #: NP_001710 

1 11 21 31 41 51 

I I I I I I 

MHVRSLRAAA PHSFVALWAP LFLLRSALAD FSLDNEVHSS FIHRRLRSQE RREMQREILS 60 

ILGLPHRPRP HLQGKHNSAP MFMLDLYNAM AVEEGGGPGG QGFSYPYKAV FSTQGPPLAS 120 

LQDSHFLTDA DMVMSFVNLV EHDKEFFHPR YHHREFRFDL SKIPEGEAVT AAEFRIYKDY 180 

IRERFDNETF RISVYQVLQE HLGRESDLFL LDSRTLWASE EGWLVFDITA TSNHWVVNPR 240 

HNLGLQLSVE TLDGQSINPK LAGLIGRHGP QNKQPFMVAF FKATEVHFRS IRSTGSKQRS 300 

QNRSKTPKNQ EALRMANVAE NSSSDQRQAC KKHELYVSFR DLGWQDWIIA PEGYAAYYCE 360 

GECAFPLNSY MNATNHAIVQ TLVHFINPET VPKPCCAPTQ LNAISVLYFD DSSNVILKKY 420 
RNMVVRACGC H 

Seq ID NO: 660 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 211.. 1895 

1 11 21 31 41 51 

I I I I I I 

GGATCTGAGG GGCGCCCAGT CACTTCCTCC ACGTTCTCGT GCTGGGCGGG AGGAGCGGAT 60 

GGGGCTTGGG AGGCAGCCTG CTCTCCAGTC CCTATCCACC CACAGGTTTT TTGGGTCGGA 120 

GAGGAATTAT CTGATAAAAT TCCTGGGTTA ATATTTTTAA AAACGGAGAG TTTTTAAAAA 180 

TGATTTTTTT CCCTCGAAAA TGACCTTTTT ATGCTTCGAA GCAGTTTGTC AACCAGCATA 240 

GTGCTTTTTC TTTTCTCTTC TTTTTCTACG ATAAATGAAA GCATTTCTTC AAGAAAAAGG 300 

CACAGGTTCC TTGAACAGCT GGATTCTGAT GGCACCATTA CTATAGAGGA GCAGATTGTC 360 

CTTGTGCTGA AAGCGAAAGT ACAATGTGAA CTCAACATCA CAGCTCAACT CCAGGAGGGA 420 

GAAGGTAATT GTTTCCCTGA ATGGGATGGA CTCATTTGTT GGCCCAGAGG AACAGTGGGG 480 

AAAATATCGG CTGTTCCATG CCCTCCTTAT ATTTATGACT TCAACCATAA AGGAGTTGCT 540 

TTCCGACACT GTAACCCCAA TGGAACATGG GATTTTATGC ACAGCTTAAA TAAAACATGG 600 

GCCAATTATT CAGACTGCCT TCGCTTTCTG CAGCCAGATA TCAGCATAGG AAAGCAAGAA 660 

TTCTTTGAAC GCCTCTATGT AATGTATACC GTTGGCTACT CCATCTCTTT TGGTTCCTTG 720 

GCTGTGGCTA TTCTCATCAT TGGTTACTTC AGACGATTGC ATTGCACTAG GAACTATATC 780 

CACATGCACT TATTTGTGTC TTTCATGCTG AGAGCTACAA GCATCTTTGT CAAAGACAGA 840 

GTAGTCCATG CTCACATAGG AGTAAAGGAG CTGGAGTCCC TAATAATGCA GGATGACCCA 900 

CAAAATTCCA TTGAGGCAAC TTCTGTGGAC AAATCACAAT ATATCGGGTG CAAGATTGCT 960 

GTTGTGATGT TTATTTACTT CCTGGCTACA AATTATTATT GGATCCTGGT GGAAGGTCTC 1020 

TACCTGCATA ATCTCATCTT TGTGGCTTTC TTTTCGGACA CCAAATACCT GTGGGGCTTC 1080 

ATCTTGATAG GCTGGGGGTT TCCAGCAGCA TTTGTTGCAG CATGGGCTGT GGCACGAGCA 1140 

ACTCTGGCTG ATGCGAGGTG CTGGGAACTT AGTGCTGGAG ACATCAAGTG GATTTATCAA 1200 

GCACCGATCT TAGCAGCTAT TGGGCTGAAT TTTATTCTGT TTCTGAATAC GGTTAGAGTT 1260 

CTAGCTACCA AAATCTGGGA GACCAATGCA GTTGGGCATG ACACAAGGAA GCAATACAGG 1320 

AAACTGGCCA AATCGACACT GGTCCTGGTC CTAGTCTTTG GAGTGCATTA CATCGTGTTC 1380 

GTATGCCTGC CTCACTCCTT CACTGGGCTC GGGTGGGAGA TCCGCATGCA CTGTGAGCTC 1440 

TTCTTCAACT CCTTTCAGGG TTTCTTTGTG TCTATCATCT ACTGCTACTG CAATGGAGAG 1500 

GTTCAGGCAG AGGTGAAGAA GATGTGGAGT CGGTGGAATC TCTCCGTGGA CTGGAAAAGG 1560 

ACACCGCCAT GTGGCAGCCG CAGATGCGGC TCAGTGCTCA CCACCGTGAC GCACAGCACC 1620 

AGCAGCCAGT CACAGGTGGC GGCCAGCACA CGCATGGTGC TTATCTCTGG CAAAGCTGCC 1680 

AAGATCGCCA GCAGACAGCC TGACAGCCAC ATCACTTTAC CTGGCTATGT CTGGAGTAAC 1740 

TCAGAGCAGG ACTGCCTGCC ACACTCTTTC CACGAGGAGA CCAAGGAAGA TAGTGGGAGG 1800 

CAGGGAGATG ATATTCTAAT GGAGAAGCCT TCCAGGCCTA TGGAATCTAA CCCAGACACT 1860 
GAAGGATGCC AAGGAGAAAC TGAGGATGTT CTCTGA 

Seq ID NO: 661 Protein sequence 
Protein Accession #: Eos sequence 

1 11 21 31 41 51 

I I I I I I 

MLRSSLSTSI VLFLFSSFST INESISSRKR HRFLEQLDSD GTITIEEQIV LVLKAKVQCE 60 

LNITAQLQEG EGNCFPEWDG LICWPRGTVG KISAVPCPPY IYDFNHKGVA FRHCNPNGTW 120 

DFMHSLNKTW ANYSDCLRFL QPDISIGKQE FFERLYVMYT VGYSISFGSL AVAILIIGYF 180 

RRLHCTRNYI HMHLFVSFML RATSIFVKDR VVHAHIGVKE LESLIMQDDP QNSIEATSVD 240 

KSQYIGCKIA VVMFIYFLAT NYYWILVEGL YLHNLIFVAF FSDTKYLWGF ILIGWGFPAA 300 

FVAAWAVARA TLADARCWEL SAGDIKWIYQ APILAAIGLN FILFLNTVRV LATKIWETNA 360 

VGHDTRKQYR KLAKSTLVLV LVFGVHYIVF VCLPHSFTGL GWEIRMHCEL FFNSFQGFFV 420 

SIIYCYCNGE VQAEVKKMWS RWNLSVDWKR TPPCGSRRCG SVLTTVTHST SSQSQVAAST 480 

RMVLISGKAA KIASRQPDSH ITLPGYVWSN SEQDCLPHSF HEETKEDSGR QGDDILMEKP 540 
SRPMESNPDT EGCQGETEDV L 

Seq ID NO: 662 DNA sequence 
Nucleic Acid Accession #: NM_005048 
Coding sequence: 143.. 1795 

1 11 21 31 41 51 

I I I I I I 

GGCCGGTGGC CCGGGCCCGA CCACCCCAGC TGCGCGTCGT TACTGGCCAC AAGTTTGCTC 60 

TGGGCCAGCC AAGTTGGCAA CTTGGAAGCT TCTCCCGGGC TCTGGAGGAG GGTCCCTGCT 120 

TCTTCCTACA GCCGTTCCGG GCATGGCCGG GCTGGGGGCG TCGCTCCACG TCTGGGGTTG 180 

GCTAATGCTC GGCAGCTGCC TCCTGGCCAG AGCCCAGCTG GATTCTGATG GCACCATTAC 240 

TATAGAGGAG CAGATTGTCC TTGTGCTGAA AGCGAAAGTA CAATGTGAAC TCAACATCAC 300 

AGCTCAACTC CAGGAGGGAG AAGGTAATTG TTTCCCTGAA TGGGATGGAC TCATTTGTTG 360 

GCCCAGAGGA ACAGTGGGGA AAATATCGGC TGTTCCATGC CCTCCTTATA TTTATGACTT 420 

CAACCATAAA GGAGTTGCTT TCCGACACTG TAACCCCAAT GGAACATGGG ATTTTATGCA 480 

CAGCTTAAAT AAAACATGGG CCAATTATTC AGACTGCCTT CGCTTTCTGC AGCCAGATAT 540 

CAGCATAGGA AAGCAAGAAT TCTTTGAACG CCTCTATGTA ATGTATACCG TTGGCTACTC 600 

CATCTCTTTT GGTTCCTTGG CTGTGGCTAT TCTCATCATT GGTTACTTCA GACGATTGCA 660 

TTGCACTAGG AACTATATCC ACATGCACTT ATTTGTGTCT TTCATGCTGA GAGCTACAAG 720 

CATCTTTGTC AAAGACAGAG TAGTCCATGC TCACATAGGA GTAAAGGAGC TGGAGTCCCT 780 

AATAATGCAG GATGACCCAC AAAATTCCAT TGAGGCAACT TCTGTGGACA AATCACAATA 840 

TATCGGGTGC AAGATTGCTG TTGTGATGTT TATTTACTTC CTGGCTACAA ATTATTATTG 900 

GATCCTGGTG GAAGGTCTCT ACCTGCATAA TCTCATCTTT GTGGCTTTCT TTTCGGACAC 960 

CAAATACCTG TGGGGCTTCA TCTTGATAGG CTGGGGGTTT CCAGCAGCAT TTGTTGCAGC 1020 

ATGGGCTGTG GCACGAGCAA CTCTGGCTGA TGCGAGGTGC TGGGAACTTA GTGCTGGAGA 1080 

CATCAAGTGG ATTTATCAAG CACCGATCTT AGCAGCTATT GGGCTGAATT TTATTCTGTT 1140 

TCTGAATACG GTTAGAGTTC TAGCTACCAA AATCTGGGAG ACCAATGCAG TTGGGCATGA 1200 

CACAAGGAAG CAATACAGGA AACTGGCCAA ATCGACACTG GTCCTGGTCC TAGTCTTTGG 1260 

AGTGCATTAC ATCGTGTTCG TATGCCTGCC TCACTCCTTC ACTGGGCTCG GGTGGGAGAT 1320 

CCGCATGCAC TGTGAGCTCT TCTTCAACTC CTTTCAGGGT TTCTTTGTGT CTATCATCTA 1380 

CTGCTACTGC AATGGAGAGG TTCAGGCAGA GGTGAAGAAG ATGTGGAGTC GGTGGAATCT 1440 

CTCCGTGGAC TGGAAAAGGA CACCGCCATG TGGCAGCCGC AGATGCGGCT CAGTGCTCAC 1500 

CACCGTGACG CACAGCACCA GCAGCCAGTC ACAGGTGGCG GCCAGCACAC GCATGGTGCT 1560 

TATCTCTGGC AAAGCTGCCA AGATCGCCAG CAGACAGCCT GACAGCCACA TCACTTTACC 1620 

TGGCTATGTC TGGAGTAACT CAGAGCAGGA CTGCCTGCCA CACTCTTTCC ACGAGGAGAC 1680 

CAAGGAAGAT AGTGGGAGGC AGGGAGATGA TATTCTAATG GAGAAGCCTT CCAGGCCTAT 1740 

GGAATCTAAC CCAGACACTG AAGGATGCCA AGGAGAAACT GAGGATGTTC TCTGAATGGA 1800 

CATTTGTGGC TGACTTTCAT GGGCTGGTCC AATGGCTGGT TGTGTGAGAG GGCTTGGCTG 1860 

ATACTCCTAT GCTTGAGTTC AAAGGCTGAA AATTCAGTTA AGGTGTTACT TAATAATAGT 1920 

TTTTAGGCTC CATGAATTGG CTCCTGTAAA TACTAACGAC ATGAAAATGC AAGTGTCAAT 1980 

GGAGTAGTTT ATTACCTTCT ATTGGCATCA AGTTTTCCTC TAAATTAATG TATGGTATTT 2040 

GCTCTGTGAT TGTTCATTTT TTTCTGCTAC TTTTGGGTAG AAAAAAGATT CAATTGCTTG 2100 

GCTGTAGCTT TCTCTCATAT ATATCACCCT AAATATAATG AAGATCTTTT AGTGTGTATC 2160 

ATTTTCCTTT TAGAAACTAG TATTCTCTTA TTTCTTACTT TAATGTACTT CTATCACTGC 2220 

ATTTATTTTG CCTGTGCATA GGAGCAATTA GGATCTAAAA AAATATATGG GAAGATAAAA 2280 

GATCTAAGAA CAAGTACTTG CTGGAAAATT AGTTGGCTGG ACATTGATAA AATAATGCAT 2340 

TTATAACAAT TACATGTGTT TTTGGGAACA AGGAAAATTT CTCAAAAAAG AATATTTCAC 2400 

ACATCCCTTC TTTTGAATGG CCTCTTTGTG ACCAGCCAGA CCTCAGGTCT TCACTCTTTC 2460 

TTCTTTGTAA ACCATGTCAT GTGGAAAGAT TTCCTCAGTT AGTGAGCTTG TGTCTGCAAA 2520 

TTGATTTTGT TTGTAATGTA TTTTGATAGC AAATCATGCT GCATCTATAT CTTTTTCTTG 2580 

TTTGAGCTGT TACTACATTG TACATGGCAT GTGGGATCAA TTAAAAATTT GTTTTAAAAA 2640 
T 

Seq ID NO: 663 Protein sequence 
Protein Accession #: NP_005039 

1 11 21 31 41 51 

I I I I I I 

MAGLGASLHV WGWLMLGSCL LARAQLDSDG TITIEEQIVL VLKAKVQCEL NITAQLQEGE 60 

GNCFPEWDGL ICWPRGTVGK ISAVPCPPYI YDFNHKGVAF RHCNPNGTWD FMHSLNKTWA 120 

NYSDCLRFLQ PDISIGKQEF FERLYVMYTV GYSISFGSLA VAILIIGYFR RLHCTRNYIH 180 

MHLFVSFMLR ATSIFVKDRV VHAHIGVKEL ESLIMQDDPQ NSIEATSVDK SQYIGCKIAV 240 

VMFIYFLATN YYWILVEGLY LHNLIFVAFF SDTKYLWGFI LIGWGFPAAF VAAWAVARAT 300 

LADARCWELS AGDIKWIYQA PILAAIGLNF ILFLNTVRVL ATKIWETNAV GHDTRKQYRK 360 

LAKSTLVLVL VFGVHYIVFV CLPHSFTGLG WEIRMHCELF FNSFQGFFVS IIYCYCNGEV 420 

QAEVKKMWSR WNLSVDWKRT PPCGSRRCGS VLTTVTHSTS SQSQVAASTR MVLISGKAAK 480 

IASRQPDSHI TLPGYVWSNS EQDCLPHSFH EETKEDSGRQ GDDILMEKPS RPMESNPDTE 540 
GCQGETEDVL 

Seq ID NO: 664 DNA sequence 
Nucleic Acid Accession #: NM_012152 
Coding sequence: 43.. 1104 

1 11 21 31 41 51 

I I I I I I 

CTTCTTTAAA TTTCTTTCTA GGATGTTCAC TTCTTCTCCA CAATGAATGA GTGTCACTAT 60 

GACAAGCACA TGGACTTTTT TTATAATAGG AGCAACACTG ATACTGTCGA TGACTGGACA 120 

GGAACAAAGC TTGTGATTGT TTTGTGTGTT GGGACGTTTT TCTGCCTGTT TATTTTTTTT 180 

TCTAATTCTC TGGTCATCGC GGCAGTGATC AAAAACAGAA AATTTCATTT CCCCTTCTAC 240 

TACCTGTTGG CTAATTTAGC TGCTGCCGAT TTCTTCGCTG GAATTGCCTA TGTATTCCTG 300 

ATGTTTAACA CAGGCCCAGT TTCAAAAACT TTGACTGTCA ACCGCTGGTT TCTCCGTCAG 360 

GGGCTTCTGG ACAGTAGCTT GACTGCTTCC CTCACCAACT TGCTGGTTAT CGCCGTGGAG 420 

AGGCACATGT CAATCATGAG GATGCGGGTC CATAGCAACC TGACCAAAAA GAGGGTGACA 480 

CTGCTCATTT TGCTTGTCTG GGCCATCGCC ATTTTTATGG GGGCGGTCCC CACACTGGGC 540 

TGGAATTGCC TCTGCAACAT CTCTGCCTGC TCTTCCCTGG CCCCCATTTA CAGCAGGAGT 600 

TACCTTGTTT TCTGGACAGT GTCCAACCTC ATGGCCTTCC TCATCATGGT TGTGGTGTAC 660 

CTGCGGATCT ACGTGTACGT CAAGAGGAAA ACCAACGTCT TGTCTCCGCA TACAAGTGGG 720 

TCCATCAGCC GCCGGAGGAC ACCCATGAAG CTAATGAAGA CGGTGATGAC TGTCTTAGGG 780 

GCGTTTGTGG TATGCTGGAC CCCGGGCCTG GTGGTTCTGC TCCTCGACGG CCTGAACTGC 840 

AGGCAGTGTG GCGTGCAGCA TGTGAAAAGG TGGTTCCTGC TGCTGGCGCT GCTCAACTCC 900 

GTCGTGAACC CCATCATCTA CTCCTACAAG GACGAGGACA TGTATGGCAC CATGAAGAAG 960 

ATGATCTGCT GCTTCTCTCA GGAGAACCCA GAGAGGCGTC CCTCTCGCAT CCCCTCCACA 1020 

GTCCTCAGCA GGAGTGACAC AGGCAGCCAG TACATAGAGG ATAGTATTAG CCAAGGTGCA 1080 

GTCTGCAATA AAAGCACTTC CTAAACTCTG GATGCCTCTC GGCCCACCCA GGTGATGACT 1140 
GTCTTAGG 

Seq ID NO: 665 Protein sequence 
Protein Accession #: NP_036284 

1 11 21 31 41 51 

I I I I I I 

MNECHYDKHM DFFYNRSNTD TVDDWTGTKL VIVLCVGTFF CLFIFFSNSL VIAAVIKNRK 60 

FHFPFYYLLA NLAAADFFAG IAYVFLMFNT GPVSKTLTVN RWFLRQGLLD SSLTASLTNL 120 

LVIAVERHMS IMRMRVHSNL TKKRVTLLIL LVWAIAIFMG AVPTLGWNCL CNISACSSLA 180 

PIYSRSYLVF WTVSNLMAFL IMVVVYLRIY VYVKRKTNVL SPHTSGSISR RRTPMKLMKT 240 

VMTVLGAFVV CWTPGLVVLL LDGLNCRQCG VQHVKRWFLL LALLNSVVNP IIYSYKDEDM 300 
YGTMKKMICC FSQENPERRP SRIPSTVLSR SDTGSQYIED SISQGAVCNK STS 
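
The "Coding sequence" line gives 1-based, inclusive nucleotide positions, so the protein of Seq ID NO: 665 can be re-derived from Seq ID NO: 664 by translating from position 43. The following is a minimal sketch of that check using the standard genetic code and the first two rows of Seq ID NO: 664; the helper names (`parse_rows`, `translate_cds`, `cds_residue_count`) are hypothetical and not part of the listing, and the sketch assumes the stated coding range includes the terminal stop codon.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def parse_rows(rows):
    """Join listing rows into one string, dropping spaces and position labels."""
    parts = []
    for row in rows:
        parts.extend(tok.upper() for tok in row.split() if not tok.isdigit())
    return "".join(parts)


def translate_cds(dna, start, end=None):
    """Translate dna[start..end] (1-based, inclusive), stopping at a stop codon.

    Codons containing characters other than A, C, G, T are reported as 'X'.
    """
    region = dna[start - 1:end]
    protein = []
    for pos in range(0, len(region) - len(region) % 3, 3):
        residue = CODON_TABLE.get(region[pos:pos + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def cds_residue_count(start, end):
    """Number of residues encoded by a range that ends with a stop codon."""
    length = end - start + 1
    if length % 3:
        raise ValueError("coding range is not a whole number of codons")
    return length // 3 - 1


rows = [
    "CTTCTTTAAA TTTCTTTCTA GGATGTTCAC TTCTTCTCCA CAATGAATGA GTGTCACTAT 60",
    "GACAAGCACA TGGACTTTTT TTATAATAGG AGCAACACTG ATACTGTCGA TGACTGGACA 120",
]
dna = parse_rows(rows)
peptide = translate_cds(dna, 43, 120)  # -> "MNECHYDKHMDFFYNRSNTDTVDDWT"
residues = cds_residue_count(43, 1104)  # -> 353
```

The translated fragment agrees with the first 26 residues of Seq ID NO: 665, and the range 43.. 1104 corresponds to the 353 residues listed for that protein.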

Seq ID NO: 666 DNA sequence 
Nucleic Acid Accession #: NM_002821 
Coding sequence: 150.. 3362 

1 11 21 31 41 51 

I I I I I I 

AACTCCCGCC TCGGGACGCC TCGGGGTCGG GCTCCGGCTG CGGCTGCTGC TGCGGCGCCC 60 

GCGCTCCGGT GCGTCCGCCT CCTGTGCCCG CCGCGGAGCA GTCTGCGGCC CGCCGTGCGC 120 

CCTCAGCTCC TTTTCCTGAG CCCGCCGCGA TGGGAGCTGC GCGGGGATCC CCGGCCAGAC 180 

CCCGCCGGTT GCCTCTGCTC AGCGTCCTGC TGCTGCCGCT GCTGGGCGGT ACCCAGACAG 240 

CCATTGTCTT CATCAAGCAG CCGTCCTCCC AGGATGCACT GCAGGGGCGC CGGGCGCTGC 300 

TTCGCTGTGA GGTTGAGGCT CCGGGCCCGG TACATGTGTA CTGGCTGCTC GATGGGGCCC 360 

CTGTCCAGGA CACGGAGCGG CGTTTCGCCC AGGGCAGCAG CCTGAGCTTT GCAGCTGTGG 420 

ACCGGCTGCA GGACTCTGGC ACCTTCCAGT GTGTGGCTCG GGATGATGTC ACTGGAGAAG 480 

AAGCCCGCAG TGCCAACGCC TCCTTCAACA TCAAATGGAT TGAGGCAGGT CCTGTGGTCC 540 

TGAAGCATCC AGCCTCGGAA GCTGAGATCC AGCCACAGAC CCAGGTCACA CTTCGTTGCC 600 

ACATTGATGG GCACCCTCGG CCCACCTACC AATGGTTCCG AGATGGGACC CCCCTTTCTG 660 

ATGGTCAGAG CAACCACACA GTCAGCAGCA AGGAGCGGAA CCTGACGCTC CGGCCAGCTG 720 

GTCCTGAGCA TAGTGGGCTG TATTCCTGCT GCGCCCACAG TGCTTTTGGC CAGGCTTGCA 780 

GCAGCCAGAA CTTCACCTTG AGCATTGCTG ATGAAAGCTT TGCCAGGGTG GTGCTGGCAC 840 

CCCAGGACGT GGTAGTAGCG AGGTATGAGG AGGCCATGTT CCATTGCCAG TTCTCAGCCC 900 

AGCCACCCCC GAGCCTGCAG TGGCTCTTTG AGGATGAGAC TCCCATCACT AACCGCAGTC 960 

GCCCCCCACA CCTCCGCAGA GCCACAGTGT TTGCCAACGG GTCTCTGCTG CTGACCCAGG 1020 

TCCGGCCACG CAATGCAGGG ATCTACCGCT GCATTGGCCA GGGGCAGAGG GGCCCACCCA 1080 

TCATCCTGGA AGCCACACTT CACCTAGCAG AGATTGAAGA CATGCCGCTA TTTGAGCCAC 1140 

GGGTGTTTAC AGCTGGCAGC GAGGAGCGTG TGACCTGCCT TCCCCCCAAG GGTCTGCCAG 1200 

AGCCCAGCGT GTGGTGGGAG CACGCGGGAG TCCGGCTGCC CACCCATGGC AGGGTCTACC 1260 

AGAAGGGCCA CGAGCTGGTG TTGGCCAATA TTGCTGAAAG TGATGCTGGT GTCTACACCT 1320 

GCCACGCGGC CAACCTGGCT GGTCAGCGGA GACAGGATGT CAACATCACT GTGGCCACTG 1380 

TGCCCTCCTG GCTGAAGAAG CCCCAAGACA GCCAGCTGGA GGAGGGCAAA CCCGGCTACT 1440 

TGGATTGCCT GACCCAGGCC ACACCAAAAC CTACAGTTGT CTGGTACAGA AACCAGATGC 1500 

TCATCTCAGA GGACTCACGG TTCGAGGTCT TCAAGAATGG GACCTTGCGC ATCAACAGCG 1560 

TGGAGGTGTA TGATGGGACA TGGTACCGTT GTATGAGCAG CACCCCAGCC GGCAGCATCG 1620 

AGGCGCAAGC CCGTGTCCAA GTGCTGGAAA AGCTCAAGTT CACACCACCA CCCCAGCCAC 1680 

AGCAGTGCAT GGAGTTTGAC AAGGAGGCCA CGGTGCCCTG TTCAGCCACA GGCCGAGAGA 1740 

AGCCCACTAT TAAGTGGGAA CGGGCAGATG GGAGCAGCCT CCCAGAGTGG GTGACAGACA 1800 

ACGCTGGGAC CCTGCATTTT GCCCGGGTGA CTCGAGATGA CGCTGGCAAC TACACTTGCA 1860 

TTGCCTCCAA CGGGCCGCAG GGCCAGATTC GTGCCCATGT CCAGCTCACT GTGGCAGTTT 1920 

TTATCACCTT CAAAGTGGAA CCAGAGCGTA CGACTGTGTA CCAGGGCCAC ACAGCCCTAC 1980 

TGCAGTGCGA GGCCCAGGGG GACCCCAAGC CGCTGATTCA GTGGAAAGGC AAGGACCGCA 2040 

TCCTGGACCC CACCAAGCTG GGACCCAGGA TGCACATCTT CCAGAATGGC TCCCTGGTGA 2100 

TCCATGACGT GGCCCCTGAG GACTCAGGCC GCTACACCTG CATTGCAGGC AACAGCTGCA 2160 

ACATCAAGCA CACGGAGGCC CCCCTCTATG TCGTGGACAA GCCTGTGCCG GAGGAGTCGG 2220 

AGGGCCCTGG CAGCCCTCCC CCCTACAAGA TGATCCAGAC CATTGGGTTG TCGGTGGGTG 2280 

CCGCTGTGGC CTACATCATT GCCGTGCTGG GCCTCATGTT CTACTGCAAG AAGCGCTGCA 2340 

AAGCCAAGCG GCTGCAGAAG CAGCCCGAGG GCGAGGAGCC AGAGATGGAA TGCCTCAACG 2400 

GAGGGCCTTT GCAGAACGGG CAGCCCTCAG CAGAGATCCA AGAAGAAGTG GCCTTGACCA 2460 

GCTTGGGCTC CGGCCCCGCG GCCACCAACA AACGCCACAG CACAAGTGAT AAGATGCACT 2520 

TCCCACGGTC TAGCCTGCAG CCCATCACCA CGCTGGGGAA GAGTGAGTTT GGGGAGGTGT 2580 

TCCTGGCAAA GGCTCAGGGC TTGGAGGAGG GAGTGGCAGA GACCCTGGTA CTTGTGAAGA 2640 

GCCTGCAGAC GAAGGATGAG CAGCAGCAGC TGGACTTCCG GAGGGAGTTG GAGATGTTTG 2700 

GGAAGCTGAA CCACGCCAAC GTGGTGCGGC TCCTGGGGCT GTGCCGGGAG GCTGAGCCCC 2760 

ACTACATGGT GCTGGAATAT GTGGATCTGG GAGACCTCAA GCAGTTCCTG AGGATTTCCA 2820 

AGAGCAAGGA TGAAAAATTG AAGTCACAGC CCCTCAGCAC CAAGCAGAAG GTGGCCCTAT 2880 

GCACCCAGGT AGCCCTGGGC ATGGAGCACC TGTCCAACAA CCGCTTTGTG CATAAGGACT 2940 

TGGCTGCGCG TAACTGCCTG GTCAGTGCCC AGAGACAAGT GAAGGTGTCT GCCCTGGGCC 3000 

TCAGCAAGGA TGTGTACAAC AGTGAGTACT ACCACTTCCG CCAGGCCTGG GTGCCGCTGC 3060 

GCTGGATGTC CCCCGAGGCC ATCCTGGAGG GTGACTTCTC TACCAAGTCT GATGTCTGGG 3120 

CCTTCGGTGT GCTGATGTGG GAAGTGTTTA CACATGGAGA GATGCCCCAT GGTGGGCAGG 3180 

CAGATGATGA AGTACTGGCA GATTTGCAGG CTGGGAAGGC TAGACTTCCT CAGCCCGAGG 3240 

GCTGCCCTTC CAAACTCTAT CGGCTGATGC AGCGCTGCTG GGCCCTCAGC CCCAAGGACC 3300 

GGCCCTCCTT CAGTGAGATT GCCAGCGCCC TGGGAGACAG CACCGTGGAC AGCAAGCCGT 3360 

GAGGAGGGAG CCCGCTCAGG ATGGCCTGGG CAGGGGAGGA CATCTCTAGA GGGAAGCTCA 3420 

CAGCATGATG GGCAAGATCC CTGTCCTCCT GGGCCCTGAG GTGCCCTAGT GCAACAGGCA 3480 

TTGCTGAGGT CTGAGCAGGG CCTGGCCTTT CCTCCTCTTC CTCACCCTCA TCCTTTGGGA 3540 

GGCTGACTTG GACCCAAACT GGGCGACTAG GGCTTTGAGC TGGGCAGTTT CCCCTGCCAC 3600 

CTCTTCCTCT ATCAGGGACA GTGTGGGTGC CACAGGTAAC CCCAATTTCT GGCCTTCAAC 3660 

TTCTCCCCTT GACCGGGTCC AACTCTGCCA CTCATCTGCC AACTTTGCCT GGGGAGGGCT 3720 

AGGCTTGGGA TGAGCTGGGT TTGTGGGGAG TTCCTTAATA TTCTCAAGTT CTGGGCACAC 3780 

AGGGTTAATG AGTCTCTTGC CCACTGGTCC ACTTGGGGGT CTAGACCAGG ATTATAGAGG 3840 

ACACAGCAAG TGAGTCCTCC CCACTCTGGG CTTGTGCACA CTGACCCAGA CCCACGTCTT 3900 

CCCCACCCTT CTCTCCTTTC CTCATCCTAA GTGCCTGGCA GATGAAGGAG TTTTCAGGAG 3960 

CTTTTGACAC TATATAAACC GCCCTTTTTG TATGCACCAC GGGCGGCTTT TATATGTAAT 4020 

TGCAGCGTGG GGTGGGTGGG CATGGGAGGT AGGGGTGGGC CCTGGAGATG AGGAGGGTGG 4080 

GCCATCCTTA CCCCACACTT TTATTGTTGT CGTTTTTTGT TTGTTTTGTT TTTTTGTTTT 4140 
TGTTTTTGTT TTTACACTCG CTGCTCTCAA TAAATAAGCC TTTTTTA 

Seq ID NO: 667 Protein sequence 
Protein Accession #: NP_002812 

1 11 21 31 41 51 

I I I I I I 

MGAARGSPAR PRRLPLLSVL LLPLLGGTQT AIVFIKQPSS QDALQGRRAL LRCEVEAPGP 60 

VHVYWLLDGA PVQDTERRFA QGSSLSFAAV DRLQDSGTFQ CVARDDVTGE EARSANASFN 120 

IKWIEAGPVV LKHPASEAEI QPQTQVTLRC HIDGHPRPTY QWFRDGTPLS DGQSNHTVSS 180 

KERNLTLRPA GPEHSGLYSC CAHSAFGQAC SSQNFTLSIA DESFARVVLA PQDVVVARYE 240 

EAMFHCQFSA QPPPSLQWLF EDETPITNRS RPPHLRRATV FANGSLLLTQ VRPRNAGIYR 300 

CIGQGQRGPP IILEATLHLA EIEDMPLFEP RVFTAGSEER VTCLPPKGLP EPSVWWEHAG 360 

VRLPTHGRVY QKGHELVLAN IAESDAGVYT CHAANLAGQR RQDVNITVAT VPSWLKKPQD 420 

SQLEEGKPGY LDCLTQATPK PTVVWYRNQM LISEDSRFEV FKNGTLRINS VEVYDGTWYR 480 

CMSSTPAGSI EAQARVQVLE KLKFTPPPQP QQCMEFDKEA TVPCSATGRE KPTIKWERAD 540 

GSSLPEWVTD NAGTLHFARV TRDDAGNYTC IASNGPQGQI RAHVQLTVAV FITFKVEPER 600 

TTVYQGHTAL LQCEAQGDPK PLIQWKGKDR ILDPTKLGPR MHIFQNGSLV IHDVAPEDSG 660 

RYTCIAGNSC NIKHTEAPLY VVDKPVPEES EGPGSPPPYK MIQTIGLSVG AAVAYIIAVL 720 

GLMFYCKKRC KAKRLQKQPE GEEPEMECLN GGPLQNGQPS AEIQEEVALT SLGSGPAATN 780 

KRHSTSDKMH FPRSSLQPIT TLGKSEFGEV FLAKAQGLEE GVAETLVLVK SLQTKDEQQQ 840 

LDFRRELEMF GKLNHANVVR LLGLCREAEP HYMVLEYVDL GDLKQFLRIS KSKDEKLKSQ 900 

PLSTKQKVAL CTQVALGMEH LSNNRFVHKD LAARNCLVSA QRQVKVSALG LSKDVYNSEY 960 

YHFRQAWVPL RWMSPEAILE GDFSTKSDVW AFGVLMWEVF THGEMPHGGQ ADDEVLADLQ 1020 
AGKARLPQPE GCPSKLYRLM QRCWALSPKD RPSFSEIASA LGDSTVDSKP 

Seq ID NO: 668 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1389



1 
I 

ATGGGCTACC 
ACCCTTGTTT 
GTTGTCAACT 
GGGTTTCCTT 
GTTTTATTGA 
AAAACTTTCG 
ATAGCAATGA 
ATCCCAGGAG 
ACAGTTACCT 
TCCCTCATCT 
TCACTGGGTC 
ATTCAAGCGG 
TACAGTTCTC 
GTGATTTCTG 
TTCACCCAAG 
AGATTTTGTT 
GAGGTAATTG 
ACAGTGATGG 
GTTCTAGAAC 
TGTTATCTGA 
ATGCTTCCCA 



11 
I 

AGAGGCAGGA 
CTGAACATGA 
CGATTATAGG 
TGGGAATATT 
TAAAAGGAGG 
GCTTTCCAGG 
TAAGTTACAA 
TTGATCCTGA 
TTACTCTGCC 
CTACAGGTTT 
CACACATACC 
TCGGGGTTAT 
TAGAAGAACC 
TATTTATCTG 
GGGACTTATT 
ATGGTGTCAC 
CCAATGTGTT 
TCATCACTGT 
TCAATGGTGT 
AACTGTCTGA 
TTGGTGCTGT 



21 
I 

GCCTGTCATC 
GTATAAAGAG 
ATCTGGTATA 
GCTTTTATTC 
GGCCCTCTCT 
GTATCTGCTC 
TATAATAGCT 
AAACGTGTTT 
TTTATCCTTG 
AACAACTCTG 
AAAAACAGAA 
GTCTTTTGCA 
CACAGTAGCT 
TATATTCTTT 
TGAAAATTAC 
TGTCATTTTG 
TTTTGGTGGG 
AGCCACGCTT 
GCTCTGTGCA 
AGAACCAAGG 
GGTGATGGTT 



31 
I 

CCGCCGCAGA 
AAAACCTGTC 
ATAGGATTGC 
TGGGTTTCAT 
GGAACAGATA 
CTCTCTGTTC 
GGAGATACTT 
ATTGGTCGCC 
TACCGAAATA 
ATTCTTGGAA 
GACGCTTGGG 
TTTATTTGCC 
AAGTGGTCCC 
GCTACATGTG 
TGCAGAAATG 
ACATACCCTA 
AATCTTTCAT 
GTGTCATTGC 
ACTCCCCTCA 
ACACACTCCG 
TTTGGATTCQ 



41 
I 

GAGATTTAGA 
AGTCTGCTGC 
CTTATTCAAT 
ATGTTACGGA 
CCTACCAGTC 
TTCAGTTTTT 
TGAGCAAAGT 
ACTTCATTAT 
TAGCAAAGCT 
TTGTAATGGC 
TATTTGCAAA 
ACCATAACTC 
GCCTTATCCA 
GATACTTGAC 
ATGACCTGGT 
TGGAATGCTT 
CGGTTTTCCA 
TGATTGATTG 
TTTTTATCAT 
ATAAGATTAT 
TCATGGCTAT 



2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3'780 
3 840 
3900 
3960 
4020 
4080 
4140 
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51 
I 

TGACAGAGAA 
TCTTTTTAAT 
GAAGCAAGCT 
CTTTTCCCTT 
TTTGGTCAAT 
GTATCCTTTT 
TTTTCAAAGA 
TGGACTTTCC 
TGGAAAGGTC 
AAGGGCAATT 
GCCCAATGCC 
CTTCTTAGTT 
TATGTCCATC 
ATTTACTGGC 
AACATTTGGA 
TGTGACAAGA 
CATTGTTGTA 
CCTCGGGATA 
TCCATCAGCC 
GTCTTGTGTC 
TACAAATACT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
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CAAGACTGCA CCCATGGGCA GGAAATGTTC TACTGCTTTC CTGACAATTT CTCTCTCACA 
AATACCTCAQ AGTCTCATGT TCAGCAGACA ACACAACTTT CTACTTTAAA TATTAGTATC 
TTTCAATQA 



1320 
1380 






Seq ID NO: 669 Protein sequence 
Protein Accession #: Eos sequence
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MGYQRQEPVI 
GFPLGILLLF 
IAMISYNIIA 
SLISTGLTTL 
YSSLEBPTVA 
RFCYGVTVIL 
VLELNGVLCA 
QDCTHGQEMF 



11 
I 

PPQRDLDDRE 
WVSYVTDFSL 
GDTLSKVFQR 
ILGIVMARAI 
KWSRLIHMSI 
TYPMECFVTR 
TPLIFIIPSA 
YCFPDNFSLT 



21 
I 

TLVSEHBYKE 
VLLIKGGALS 
IPGVDPENVF 
SLGPHIPKTB 
VISVFICIFF 
EVIANVFFGG 
CYLKLSEEPR 
NTSESHVQQT 



31 
I 

KTCQSAALFN 
GTDTYQSLVN 
IGRHFI IGLS 
DAWVFAKPNA 
ATCGYLTFTG 
NLSSVFHIW 
THSDKIMSCV 
TQLSTLNISI 



41 
I 

WNSIIGSGI 
KTFGFPGYU, 
TVTFTLPLSL 
IQAVGVMSFA 
FTQGDLFENY 
TVMVITVATL 
MLPIGAWMV 
FQ 



51 

I 

IGhPYSMKQA 
LSVLQFkYPF 
YRNIAKLGKV 
FICHHKSFLV 
CRNDDLVTFG 
VSLLIDCLGI 
FGFVMAITNT 



Seq ID NO: 670 DNA sequence

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..1284 



ATGGGCTACC 
AAGCAAGCTG 
TTTTGCCTTG 
TTGGTCAATA 
TATCCTTTTA 
TTTCAAAGAA 
GGACTTTCCA 
GGAAAGGTCT 
AGGGCAATTT 
CCCAATGCCA 
TTCTTAGTTT 
ATGTCCATCG 
TTTACTGGCT 
ACATTTGGAA 
GTGACAAGAG 
ATTGTTGTAA 
CTCGGGATAG 
CCATCAGCCT 
TCTTGTGTCA 
ACAAATACTC 
TCTCTCACAA 
ATTAGTATCT 



11 

I 

AGAGGCAGGA 
GGTTTCCTTT 
TTTTATTGAT 
AAACTTTCGG 
TAGCAATGAT 
TCCCAGGAGT 
CAGTTACCTT 
CCCTCATCTC 
CACTGGGTCC 
TTCAAGCGGT 
ACAGTTCTCT 
TGATTTCTGT 
TCACCCAAGG 
GATTTTGTTA 
AGGTAATTGC 
CAGTGATGGT 
TTCTAGAACT 
GTTATCTGAA 
TGCTTCCCAT 
AAGACTGCAC 
ATACCTCAGA 
TTCAACTCGA 



21 
I 

GCCTGTCATC 
GGGAATATTG 
AAAAGGAGGG 
CTTTCCAGGG 
AAGTTACAAT 
TGATCCTGAA 
TACTCTGCCT 
TACAGGTTTA 
ACACATACCA 
CGGGGTTATG 
AGAAGAACCC 
ATTTATCTGT 
GGACTTATTT 
TGGTGTCACT 
CAATGTGTTT 
CATCACTGTA 
CAATGGTGTG 
ACTGTCTGAA 
TGGTGCTGTG 
CCATGGGCAG 
GTCTCATGTT 
GTAA 



31 
I 

CCGCCGCAGA 
CTTTTATTCT 
GCCCTCTCTG 
TATCTGCTCC 
ATAATAGCTG 
AACGTGTTTA 
TTATCCTTGT 
ACAACTCTGA 
AAAACAGAAG 
TCTTTTGCAT 
ACAGTAGCTA 
ATATTCTTTG 
GAAAATTACT 
GTCATTTTGA 
TTTGGTGGGA 
GCCACGCTTG 
CTCTGTGCAA 
GAACCAAGGA 
GTGATGGTTT 
GAAATGTTCT 
CAGCAGACAA 



Seq ID NO: 671 Protein sequence 
Protein Accession #: Eos sequence 



MGYQRQEPVI 
LVNKTFGFPG 
GLSTVTFTLP 
PNAIQAVGVM 
FTGFTQGDLF 
IWTVMVITV 
SCVMLPIGAV 
ISIFQLE • 



11 
I 

PPQRGLPYSM 
YLLLSVLQFL 
LSLYRNIAKL 
SFAFICHHNS 
ENYCRNDDLV 
ATLVSLLIDC 
VMVFGFVMAI 



21 

! 

KQAGFPLGIL 
YPFIAMISYN 
GKVSLISTGL 
FLVYSSLEEP 
TFGRFCYGVT 
LGIVLELNGV 
TNTQDCTHGQ 



31 
I 

LIjFWVSYVTD 
IIAGDTLSKV 
TTLILGIVMA 
TVAKWSRLIH 
VILTYPMECF 
LCATPLIFII 
EMFYCFPDNF 



41 
I 

GAGGATTGCC 
GGGTTTCATA 
GAACAGATAC 
TCTCTGTTCT 
GAGATACTTT 
TTGGTCGCCA 
ACCGAAATAT 
TTCTTGGAAT 
ACGCTTGGGT 
TTATTTGCCA 
AGTGGTCCCG 
CTACATGTGG 
GCAGAAATGA 
CATACCCTAT 
ATCTTTCATC 
TGTCATTGCT 
CTCCCCTCAT 
CACACTCCGA 
TTGGATTCGT 
ACTGCTTTCC 
CACAACTTTC 



41 

I 

FSLVLLIKGG 
FQRIPGVDPE 
RAISLGPHZP 
MSIVISVFIC 
VTREVIANVF 
PSACYLKLSE 
SLTNTSESHV 



51 
I 

TTATTCAATG 
TGTTACAGAC 
CTACCAGTCT 
TCAGTTTTTG 
GAGCAAAGTT 
CTTCATTATT 
AGCAAAGCTT 
TGTAATGGCA 
ATTTGCAAAG 
CCATAACTCC 
CCTTATCCAT 
ATACTTGACA 
TGACCTGGTA 
GGAATGCTTT 
GGTTTTCCAC 
GATTGATTGC 
TTTTATCATT 
TAAGATTATG 
CATGGCTATT 
TGACAATTTC 
TACTTTAAAT 



51 
I 

ALSGTDTYQS 
NVFIGRHFII 
KTEDAWVFAK 
IFFATCGYLT 
FGGNLSSVFH 
EPRTHSDKIM 
QQTTQLSTLN 



Seq ID NO: 672 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 1..1203



ATGGGCTACC 
AAAGGAGGGG 
TTTCCAGGGT 
AGTTACAATA 
GATCCTGAAA 
ACTCTGCCTT 
ACAGGTTTAA 
CACATACCAA 
GGGGTTATGT 
GAAGAACCCA 
TTTATCTGTA 
GACTTATTTG 
GGTGTCACTG 
AATGTGTTTT 
ATCACTGTAG 
AATGGTGTGC 
CTGTCTGAAG 
GGTGCTGTGG 
CATGGGCAGG 
TCTCATGTTC 



11 
I 

AGAGGCAGGA 
CCCTCTCTGG 
ATCTGCTCCT 
TAATAGCTGG 
ACGTGTTTAT 
TATCCTTGTA 
CAACTCTGAT 
AAACAGAAGA 
CTTTTGCATT 
CAGTAGCTAA 
TATTCTTTGC 
AAAATTACTG 
TCATTTTGAC 
TTGGTGGGAA 
CCACGCTTGT 
TCTGTGCAAC 
AACCAAGGAC 
TGATGGTTTT 
AAATGTTCTA 
AGCAGACAAC 



21 
I 

GCCTGTCATC 
AACAGATACC 
CTCTGTTCTT 
AGATACTTTG 
TGGTCGCCAC 
CCGAAATATA 
TCTTGGAATT 
CGCTTGGGTA 
TATTTGCCAC 
GTGGTCCCGC 
TACATGTGGA 
CAGAAATGAT 
ATACCCTATG 
TCTTTCATCG 
GTCATTGCTG 
TCCCCTCATT 
ACACTCCGAT 
TGGATTCGTC 
CTGCTTTCCT 
ACAACTTTCT 



31 

I 

CCGCCGCAGT 
TACCAGTCTT 
CAGTTTTTGT 
AGCAAAGTTT 
TTCATTATTG 
GCAAAGCTTG 
GTAATGGCAA 
TTTGCAAAGC 
CATAACTCCT 
CTTATCCATA 
TACTTGACAT 
GACCTGGTAA 
GAATGCTTTG 
GTTTTCCACA 
ATTGATTGCC 
TTTATCATTC 
AAGATTATGT 
ATGGCTATTA 
GACAATTTCT 
ACTTTAAATA 



41 
I 

TTTCCCTTGT 
TGGTCAATAA 
ATCCTTTTAT 
TTCAAAGAAT 
GACTTTCCAC 
GAAAGGTCTC 
GGGCAATTTC 
CCAATGCCAT 
TCTTAGTTTA 
TGTCCATCGT 
TTACTGGCTT 
CATTTGGAAG 
TGACAAGAGA 
TTGTTGTAAC 
TCGGGATAGT 
CATCAGCCTG 
CTTGTGTCAT 
CAAATACTCA 
CTCTCACAAA 
TTAGTATCTT 



60 
120 
180 
240 
300 
360 
420 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 



60 
120 
180 
240 
300 
360 
420 



51 

I 

TTTATTGATA 
AACTTTCGGC 
AGCAATGATA 
CCCAGGAGTT 
AGTTACCTTT 
CCTCATCTCT 
ACTGGGTCCA 
TCAAGCGGTC 
CAGTTCTCTA 
GATTTCTGTA 
CACCCAAGGG 
ATTTTGTTAT 
GGTAATTGCC 
AGTGATGGTC 
TCTAGAACTC 
TTATCTGAAA 
GCTTCCCATT 
AGACTGCACC 
TACCTCAGAG 
TCAACTCGAG 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 




TAA 






Seq ID NO: 673 Protein sequence
Protein Accession #: Eos sequence



MGYQRQEPVI 
SYNIIAGDTL 
TGLTTLILGI 
EEPTVAKWSR 
GVTVILTYPM 
NGVLCATPLI 
HGQEMFYCFP 



XI 
I 

PPQFSLVLLI 
SKVFORIPGV 
VMARAISLGP 
LIHMSIVISV 
ECFVTREVIA 
FIIPSACYLK 
DNFSLTNTSE 



21 

I 

KGGALSGTDT 
DPENVFIGRH 
HIPKTEDAWV 
PICIFFATCG 
NVFFGGNLSS 
LSEEPRTHSD 
SHVQQTTQLS 



31 


41 


51 




1 

YQSLVNKTFG 


1 

FPGYLLLSVL 


1 

QPLYPFIAMI 


60 


FIIGLSTVTF 


TLPLSLYRNI 


AKLGKVSLIS 


120 


FAKPNAIQAV 


GVMSFAFICH 


HNSFLVYSSIj 


180 


YLTFTGPTQG 


DLFENYCRND 


DLVTFGRFCY 


240 


VFHIWTVMV 


ITVATLVSLL 


IDCLGIVLEL 


300 


KIMSCVMLPI 


GAWMVFGFV 


MAITNTQDCT 


360 


TLNISIFQLB 









Seq ID NO: 674 DNA sequence 

Nucleic Acid Accession #: Eos sequence 

Coding sequence: 1..1140 



I 

ATGGGCTACC 
CCAGGGTATC 
TACAATATAA 
CCTGAAAACG 
CTGCCTTTAT 
GGTTTAACAA 
ATACCAAAAA 
GTTATGTCTT 
GAACCCACAG 
ATCTGTATAT 
TTATTTGAAA 
GTCACTGTCA 
GTGTTTTTTG 
ACTGTAGCCA 
GGTGTGCTCT 
TCTGAAGAAC 
GCTGTGGTGA 
GGGCAGGAAA 
CATGTTCAGC 



11 
I 

AGAGGCAGGA 
TGCTCCTCTC 
TAGCTGGAGA 
TGTTTATTGG 
CCTTGTACCG 
CTCTGATTCT 
CAGAAGACGC 
TTGCATTTAT 
TAGCTAAGTG 
TCTTTGCTAC 
ATTACTGCAG 
TTTTGACATA 
GTGGGAATCT 
CGCTTGTGTC 
GTGCAACTCC 
CAAGGACACA 
TGGTTTTTGG 
TGTTCTACTG 
AGACAACACA 



21 
I 

GCCTGTCATC 
TGTTCTTCAG 
TACTTTGAGC 
TCGCCACTTC 
AAATATAGCA 
TGGAATTGTA 
TTGGGTATTT 
TTGCCACCAT 
GTCCCGCCTT 
ATGTGGATAC 
AAATGATGAC 
CCCTATGGAA 
TTCATCGGTT 
ATTGCTGATT 
CCTCATTTTT 
CTCCGATAAG 
ATTCGTCATG 
CTTTCCTGAC 
ACTTTCTACT 



31 

I 

CCGCCGCAGG 
TTTTTGTATC 
AAAGTTTTTC 
ATTATTGGAC 
AAGCTTGGAA 
ATGGCAAGGG 
GCAAAGCCCA 
AACTCCTTCT 
ATCCATATGT 
TTGACATTTA 
CTGGTAACAT 
TGCTTTGTGA 
TTCCACATTG 
GATTGCCTCG 
ATCATTCCAT 
ATTATGTCTT 
GCTATTACAA 
AATTTCTCTC 
TTAAATATTA 



41 
I 

TCAATAAAAC 
CTTTTATAGC 
AAAGAATCCC 
TTTCCACAGT 
AGGTCTCCCT 
CAATTTCACT 
ATGCCATTCA 
TAGTTTACAG 
CCATCGTGAT 
CTGGCTTCAC 
TTGGAAGATT 
CAAGAGAGGT 
TTGTAACAGT 
GGATAGTTCT 
CAGCCTGTTA 
GTGTCATGCT 
ATACTCAAGA 
TCACAAATAC 
GTATCTTTCA 



51 
I 

TTTCGGCTTT 
AATG ATAA GT 
AGGAGTTGAT 
TACCTTTACT 
CATCTCTACA 
GGGTCCACAC 
AGCGGTCGGG 
TTCTCTAGAA 
TTCTGTATTT 
CCAAGGGGAC 
TTGTTATGGT 
AATTGCCAAT 
GATGGTCATC 
AGAACTCAAT 
TCTGAAACTG 
TCCCATTGGT 
CTGCACCCAT 
CTCAGAGTCT 
ACTCGAGTAA 



Seq ID NO: 675 Protein sequence 
Protein Accession #: Eos sequence



MGYQRQEPVI 
PENVFIGRHF 
IPKTEDAWVF 
ICIFFATCGY 
VFFGGNLSSV 
SEEPRTHSDK 
HVQQTTQLST 



11 
I 

PPQVNKTFGF 
IIGLSTVTFT 
AKPNAIQAVG 
LTFTGFTQGD 
FHIWTVMVI 
IMSCVMLPIG 
LNISIFQLE 



21 
I 

PGYLLLSVLQ 
LPLSLYRNIA 
VMSFAFICHH 
LFENYCRNDD 
TVATLVSLLI 
AWMVFGFVM 



31 

I 

FLYPFIAMIS 
KLGKVSLIST 
NSFLVYSSLE 
LVTFGRFCYG 
DCLGIVLELN 
AITNTQDCTH 



41 
I 

YNIIAGDTLS 
GLTTLILGIV 
EPTVAKWSRL 
VTVILTYPME 
GVIiCATPLIF 
GQEMFYCFPD 



51 
I 

KVFQRIPGVD 
MARAISLGPH 
IHMSIVISVF 
CFVTREVIAN 
IIPSACYLKL 
NFSLTNTSES 



Seq ID NO: 676 DNA sequence 

Nucleic Acid Accession #: NM_006853.1

Coding sequence: 26..874



AGGAATCTGC 
ATCGGGCAGA 
CATGAGGATT 
CAGGATCATC 
CGAGAAGACG 
AGCCCACTGC 
GGAGGGCTGT 
CAGCCTCCCC 
CTCCATCACC 
CAGCTGCCTC 
CTTGCGATGC 
CAACATCACA 
GGGTGACTCC 
CCAGGATCCG 
GGACTGGATC 
ACCCTCCATT 
CAAGACCCTC 
AATCAACCTG 
GACTCTGGGA 
TCCTGGCCAT 



11 
I 

GCTCGGGTTC 
GGTCTCACAG 
CTGCAGTTAA 
AAGGGGTTCG 
CGGCTACTCT 
CTCAAGCCCC 
GAGCAGACCC 
AACAAAGACC 
TGGGCTGTGC 
ATTTCCGGCT 
GCCAACATCA 
GACACCATGG 
GGGGGCCCTC 
TGTGCGATCA 
CAGGAGACGA 
TCCACTTGGT 
TACGAACATT 
GGGTTCGAAA 
ATGACAACAC 
ATATCAAGGT 



21 

I 

CGCAGATGCA 
CAGCCAAGGA 
TCCTGCTTGC 
AGTGCAAGCC 
GTGGGGCGAC 
GCTACATAGT 
GGACAGCCAC 
ACCGCAATGA 
GACCCCTCAC 
GGGGCAGCAC 
CCATCATTGA 
TGTGTGCCAG 
TGGTCTGTAA 
CCCGAAAGCC 
TGAAGAACAA 
GTTTGGTTCC 
CTTTGGGCCT 
TCAGTGAGAC 
CTGGTTTGTT 
TTCAATAAAT 



31 
I 

GAGGTTGAGG 
ACCTGGGGCC 
TCTGGCAACA 
TCACTCCCAG 
GCTCATCGCC 
TCACCTGGGG 
TGAGTCCTTC 
CATCATGCTG 

cercTccTCA 

GTCCAGCCCC 
GCACCAGAAG 
CGTGCAGGAA 
CCAGTCTCTT 
TGGTGTCTAC 
TTAGACTGGA 
TGTTCACTCT 
CCTGGACTAC 
CTGGATTCAA 
CTCTGTTGTA 
ATTTGCTAAA 



41 
I 

TGGCTGCGGG 
CGCTCCTCCC 
GGGCTTGTAG 
CCCTGGCAGG 
CCCAGATGGC 
CAGCACAACC 
CCCCACCCCG 
GTGAAGATGG 
CGCTGTGTCA 
CAGTTACGCC 
TGTGAGAACG 
GGGGGCAAGG 
CAAGGCATTA 
ACGAAAGTCT 
CCCACCCACC 
GTTAATAAGA 
AGGAGATGCT 
ATTCTGCCTT 
TCCCCAGCCC 
TGAGTG 



51 
I 

ACTGGAAGTC 
CCCTCCAGGC 
GGGGAGAGAC 
CAGCCCTGTT 
TCCTGACAGC 
TCCAGAAGGA 
GCTTCAACAA 
CATCGCCAGT 
CTGCTGGCAC 
TGCCTCACAC 
CCTACCCCGG 
ACTCCTGCCA 
TCTCCTGGGG 
GCAAATATGT 
ACAGCCCATC 
AACCCTAAGC 
GTCACTTAAT 
GAAATATTGT 
CAAAGACAGC 



Seq ID NO: 677 Protein sequence 
Protein Accession #: NP_006844.1 

1 . 11 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 



60 
120 
180 
240 
300 
360 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 



1          11         21         31         41         51
|          |          |          |          |          |
MRILQLILLA LATGLVGGET RIIKGFECKP HSQPWQAALF EKTRLLCGAT LIAPRWLLTA 60
AHCLKPRYIV HLGQHNLQKE EGCEQTRTAT ESFPHPGFNN SLPNKDHRND IMLVKMASPV 120
SITWAVRPLT LSSRCVTAGT SCLISGWGST SSPQLRLPHT LRCANITIIE HQKCENAYPG 180
NITDTMVCAS VQEGGKDSCQ GDSGGPLVCN QSLQGIISWG QDPCAITRKP GVYTKVCKYV 240
DWIQETMKNN

Seq ID NO: 678 DNA sequence 

Nucleic Acid Accession #: Eos sequence

Coding sequence: 1..933 






120 
180 
240 






i 
i 

ATGTGCAGCA 
TTCGACAAGA 
TTCCCCTGTG 
GACTGTCCCG 
GCCCGCTACC 
AATAACTGTC 
GGGCAGGTGT 
ATCATCGGCA 
CACCAGCGGA 
CTGCTGTCCC 
AATAATGGCA 
CCACCCTCCT 
CCGCCCTACT 
CGGTCCGGGA 
GACACCAGCC 
TCTGAGCCGA 



11 
I 

ATGGACGGTG 
GTGATGAGAA 
CCAGCGGCAT 
ATGGCAGCGA 
ACTGCAAGAA 
AAGACAACAG 
TTGTGACTTC 
GCTCCGTCAT 
AGCGGAACAA 
GCCTGGTGGT 
TCCAGTATGT 
ACTCCGAGGC 
CTTCTGACAC 
GTGCCAACAG 
ACAGCCCGGG 
GCCAGGGCAC 



21 

I 

CATCCCGGGC 
GGAGTGCCCC 
CCATTGCATC 
TGAAGAGAAC 
CGGCCTCTGT 
TGATGAGGAA 
AGAGAACCAA 
TTTTGTGCTG 
CCTCATGACG 
CCTGGACCAC 
GGCCAGCCAG 
CTTGCTGGAC 
GGAATCTCTG 
TGCCAGCTCC 
GCAGCCTGGC 
TGAAGAAGTA 



31 
I 

GCCTGGCAGT 
AAGGCTAAGT 
ATTGGTCGCT 
TGCACAGCAA 
ATTGACAAGA 
AGCTGTGAAA 
CTTGTGTATT 
GTGGTGGCCC 
CTGCCCGTGC 
CCCCACCACT 
GCGGAGCAGA 
CAGAGGCCTG 
AACCAAGCCG 
CAGGCAGCCA 
CCCCAGGAGG 
TAA 



41 

I 

GTGACGGGCT 
CGAAATGTGG 
TCOGGTGCAA 
ACCCTCTGCT 
GCTTCATCTG 
GTTCTCAAGA 
ACCCCAGCAT 
TGCTGGCACT 
ACCGGCTGCA 
GCAACGTCAC 
ATGCGTCGGA 
CGTGGTATGA 
ACCTGCCCCC 
GCAGCCTCCT 
GCACTGCTGA 



51 

r 

GCCTGACTGC 
CCCGACCTTC 
TGGGTTTGAG 
TTGCTCCACC 
CGATGGACAG 
ACCCGGCAGT 
CACCTATGCC 
GGTCTTGCAC 
GCACCCTGTG 
CTACAACGTC 
AGTAGGCTCC 
CCTTCCTCCA 
CTACCGCTCC 
GAGCGTGGAA 
GCCCAGGGAC 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 



Seq ID NO: 679 Protein sequence
Protein Accession #: Eos sequence



31 



1 11 

I I 
MCSNGRCIPG AWQCDGLPDC 
DCPDGSDEEN CTANPLLCST 
GQVFVTSENQ LVYYPSITYA 
LLSRLWLDH PHHOJVTYNV 
PPYSSDTESL NQADLPPYRS 
SEPSQGTEEV 



Seq ID NO: 680 DNA sequence 
Nucleic Acid Accession #: S78203.1
Coding sequence: 1..2190 



21 

I I _ 

PDKSDEKECP KAKSKCGPTF 
ARYHCKNGLC IDKSPICDGQ 
IIGSSVIFVL WALLALVLH 
NNGIQYVASQ AEQNASEVGS 
RSGSANSASS QAASSLLSVE 



41 

I 

FPCASGIHCI 
NNCQDNSDEE 
HQRKRNNLMT 
PPSYSEALLD 
DTSHSPGQPG 



51 
I 

IGRFRCNGFE 



I 



ATGAATCCTT 
GAGGTACCAC 
AACTATCCAC 
TATGGAATGA 
ACCTCCACAT 
GCAGCCATTG 
TATGTGCTTG 
GTACACACAG 
AAACCCTGTG 
ACTAGATACT 
ATCACACCCA 
TTTGGAGTTC 
ATATACAATA 
TTTGCTATTT 
CTAGACTGGG 
AGGGTACTAT 
TCACGATGGA 
CCGGACCAGA 
TTTGTCATTT 
GCTGTTGGTA 
ATAAATGAAA 
CTGGCAGATG 
GAGTCCATCA 
AGCCAGGATT 
GTGCAGGAGA 
ATGATGGTAA 
AACACTTTGC 
GAAGACTATG 
TGTAGAACAG 
TATCTGTTTG 
ATTCCAGCCA 
GGGGAGGTCA 
ATGAAATCTG 
CTTGTTGTGG 
CTCCTGCTGG 
ACAGAGGATA 
AAACTAGAGA 



11 
I . 

TCCAGAAAAA 
CTCGACCACC 
TGAGCATTGC 
AAGCTGTGCT 
CTATATACCA 
CTGACTCGTG 
GCCATGTGAT 
TCCTATCATT 
TGGCAGCTTT 
TCTCAGTCTT 
TGCTGAGAGG 
CAGGACTGCT 
AACCACCCCC 
CCAATCGTTT 
CAGCTGAGAA 
TCCTTTATAT 
CTTTGCAAGC 
TGCAGGTTCT 
ATCGTCTGGT 
TGATCCTAGC 
TGGCCCCAGC 
ATGAGGTGAA 
AATCCTTTCA 
TTCACTTCCA 
AGAACTGGTA 
AGGATACAGA 
ATAAAGATGT 
GTGTGTCTGC 
AAGATAAGAA 
TTATTACTAA 
ACAAAATGTC 
TGTTCTCTGT 
TGCTCCAGGC 
CACAGTTCAG 
TGATCTGCCT 
TGCGGGGTCC 
CCAAGAAGAC 



21 
I 

TGAGTCCAAG 
TAGCCCTCCA 
CTTCATTGTG 
GATCCTGTAT 
TGCCTTCAGC 
GTTGGGAAAA 
CAAGTCCTTG 
GATCGGCCTG 
TGGTGGAGAC 
CTACCTGTCC 
AGATGTGCAA 
CATGGTAATT 
TGAAGGAAAC 
CAAGAACCGT 
ATATCCAAAG 
CCCATTGCCC 
CATCAGGATG 
AAATCCCTTT 
CTCCAAGTGT 
GTGCCTGGCA 
CCAGTCAGGT 
GGTGACAGTG 
GAAAACACCA 
CCTGAAATAT 
CAGTCTTGTC 
AAGCAAAACA 
CAACATCTCC 
TTATAGAACT 
CTTTTCTCTG 
TAACACCAAT 
CATTGCGTGG 
CACAGGTCTT 
AGCTTGGCTA 
TGGCCTGGTA 
GATCTTCTCC 
AGCAGATAAG 
AAAACTCTGA 



31 



41 



LPVHRLQHPV 
QRPAWYDLPP 
PQEGTAEPRD 



51 



I 



GAAACTCTTT 
AAGAAGCCAT 
GTGAATGAAT 
TTCCTGTATT 
AGCCTCTGTT 
TTCAAGACAA 
GGTGCCTTAC 
AGTCTAATAG 
CAGTTTGAAG 
ATCAATGCAG 
TGTTTTGGAG 
GCACTTGTTG 
ATAGTGGCTC 
TCTGGAGACA 
CAGCTCATTA 
ATGTTCTGGG 
AATAGGAATT 
CTGGTTCTTA 
GGAATTAACT 
TTTGCAGTTG 
CCCCAGGAGG 
GTGGGAAATG 
CACTATTCCA 
CACAATTTGT 
ATTCGTGAAG 
ACCAATGGGA 
CTGAGTACAG 
GTGCAAAGAG 
AATTTGGGTC 
CAGGGTCTTC 
CAGCTACCAC 
GAGTTTTCTT 
TTGACAATTG 
CAGTGGGCCG 
ATCATGGGCT 
CACATTCCTC 



TTTCACCTGT 
CTCCGACAAT 
TCTGCGAGCG 
TCCTGCACTG 
ATTTTACTCC 
TCATCTATCT 
CAATACTGGG 
CTTTGGGGAC 
AAAAACATGC 
GGAGCTTGAT 
AAGACTGCTA 
TGTTTGCAAT 
AAGTTTTCAA 
TTCCAAAGCG 
TGGATGTAAA 
CTCTTTTGGA 
TGGGGTTTTT 
TCTTCATCCC 
TCTCATCACT 
CGGCAGCTGT 
TTTTCCTACA 
AAAACAATTC 
AACTGCACCT 
CTCTCTACAC 
ATGGGAACAG 
TGACAACCGT 
ATACCTCTCT 
GAGAATACCC 
TTCTAGACTT 
AGGCCTGGAA 
AATATGCCCT 
ATTCTCAGGC 
CAGTTGGGAA 
AATTCATTTT 
ACTACTATGT 
ACATCCAGGG 



60 
120 
180 
240 
300 



CTCCATTGAA 
CTGTGGCTCC 
CTTTTCCTAT 
GAATGAAGAT 
CATCCTGGGA 
CTCCTTGGTG 
AGGACAAGTG 
AGGAGGCATC 
AGAGGAACGG 
TTCTACATTT 
TGCATTGGCT 
GGGAAGCAAA 
ATGTATCTGG 
ACAGCACTGG 
GGCACTGACC 
TCAGCAGGGT 
TGTGCTTCAG 
GTTGTTTGAC 
TAGGAAAATG 
AGAGATAAAA 
AGTCTTGAAT 
TCTGTTGATA 
GAAAACAAAA 
TGAGCATTCT 
TATCTCCAGC 
GAGGTTTGTT 
CAATGTTGGT 
TGCAGTGCAC 
TGGTGCAGCA 
GATTGAAGAC 
GGTTACAGCT 
TCCCTCTAGC 
TATCATCGTG 
GTTTTCCTGC 
TCCTGTAAAG 
GAACATGATC 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 



Seq ID NO: 681 Protein sequence 
Protein Accession #: AAB34388.1



i 11 






MNPFQKNESK 
YGMKAVLILY 
YVLGHVIKSL 
TRYFSVFYLS 
IYNKPPPEGN 
RVLFLYIPLP 
FVIYRLVSKC 
LADDEVKVTV 
VQBKNWYSLV 
EDYGVSAYRT 
IPANKMSIAW 
LWAQFSGLV 
KLETKKTKL 



ETLPSPVSIE 
FLYPLHWNED 
GALPILGGQV 
INAOSI/ISTP 
IVAQVFKCIW 
MPWALLDQQG 
GINFSSLRKM 

vgnennslli 
iredgnsiss 

VQRGEYPAVH 
QLPQYALVTA 
QWAEFILFSC 



21 
I 

EVPPRPPSPP 
TSTSIYHAFS 
VHTVLSLIGL 
ITPMLRGDVQ 
FAISNRFKNR 
SRWTLQAIRM 
AVGMILACLA 
ESIKSFQKTP 
MMVKDTESKT 
CRTEDKNFSL 
GEVMFSVTGL 
LLLVICLIFS 



31 
I 

KKPSPTICGS 
SLCYFTPILG 
SLIALGTGGI 
CFGEDCYALA 
SGDIPKRQHW 
NRNLGFPVUQ 
FAVAAAVEIK 
HYSKLHLKTK 
TNGMTTVRFV 
NLGLLDFGAA 
EFSYSQAPSS 
IKGYYYVPVK 



41 
I 

NYPLSIAFIV 
AAIADSWLGK 
KPCVAAFGGD 
FGVPGLLMVI 
LDWAAEKYPK 
PDQMQVLNPF 
INEMAPAQSG 
SQDFHFHLKY 
NTLHKDVNIS 
YLFVITNNTN 
MKSVLQAAWL 
TEDMRGPADK 



51 

I 

VNEFCERPSY 
FKTIIYLSLV 
OFEEKHABER 
ALWFAMGSK 
QLIMDVKALT 
LVLIFIPLFD 
PQEVFLQVLN 
HNLSLYTEHS 
LSTDTSLNVG 
QGLQAWKIED 
LTIAVGNIIV 
HIPHIQGNMI 



Seq ID NO: 682 DNA sequence
Nucleic Acid Accession #: NM_016077.
Coding sequence: 128..667



TCGCTTTGTG 
CGCGATAGAA 
ACTGTAGATG 
CTTGGCTGTT 
GATGCTCCCC 
CTTGGGAGAC 
AAAAGGGAAA 
AAGAAGAAAT 
CAAAGCTCCT 
GACTGTAAGT 
CCTAGGGATT 
TTACTAGGTG 
GATTCTAACA 
AAACCTATTC 



11 
I 

ATTCTTGATC 
ACGTGTTCGC 
CCCTCCAAAT 
GGAGTTGCTT 
AAAAGCAAGA 
AGCGGGGAGT 
GTGGCTGCCC 
CCTGAAATGC 
GATGAAGAAA 
TTAATTCAAG 
GGGCCAGGAC 
GACTTTGATA 
ACAAAAGCTG 
CCATGTTCTA 



21 
I 

CGGAACTTTG 
TTGCCCAGAA 
CCTTGGTTAT 
GTGGCATGTG 
CGAGCAAGAC 
ACAAGATGAT 
AGTGCTCTCA 
TCAAACAATG 
CCCTGATTGC 
ATGCTGGACG 
CAGCAGACCT 
TGACAACAAC 
AATTTCTTCA 
AAAAAA 



31 
I 

TCACCCAGGA 
GAAGGGAAGG 
GGAATATTTG 
CCTGGGCTGG 
ACACACAGAT 
TCTTGTGGTT 
TGCTGCTGTT 
GGAATACTGT 
ATTATTGGCC 
TACTCAGATT 
AATTGACAAA 
CCCTCCATCA 
CCCAACTTAA 



41 

I 

ACCCCGGAAG 
CGCGAGTGAG 
GCTCATCCCA 
AGCCTTCGAG 
ACTGAAAGTG 
CGAAATGACT 
TCAGCCTACA 
GGCCAGCCCA 
CATGCAAAAA 
GCACCAGGCT 
GTCACTGGTC 
CAAGTGTTTG 
ATGTTCTTGA 



51 
I 

AGGTAGCTCA 
GAAAGGAGGT 
GTACACTCGG 
TATGCTTTGG 
AAGCAAGCAT 
TAAAGATGGG 
AGCAGATTCA 
AGGTGGTGGT 
TGCTGGGACT 
CTCAAACTGT 
ACCTAAAACT 
AAGCCTGTCA 
GATGAAAATA 



Seq ID NO: 683 Protein sequence 
Protein Accession #: NP_057161.1

1          11         21         31         41         51
|          |          |          |          |          |
MPSKSLVMEY LAHPSTLGLA VGVACGMCLG WSLRVCFGML PKSKTSKTHT DTESEASILG 60
DSGEYKMILV VRNDLKMGKG KVAAQCSHAA VSAYKQIQRR NPEMLKQWEY CGQPKVVVKA 120
PDEETLIALL AHAKMLGLTV SLIQDAGRTQ IAPGSQTVLG IGPGPADLID KVTGHLKLY

Seq ID NO: 684 DNA sequence 

Nucleic Acid Accession #: NM_004864.1

Coding sequence: 26..952



1 

I 

CGGAACGAGG 
TCAGATGCTC 
GGCCGAGGCG 
ATTCCGAGAG 
CTGGGAAGAT 
AGTGCGGCTG 
GGGGCTCCCC 
AAGGTCGTGG 
GCCOGCGCTG 
ATCTTCGTCC 
CCGCAGAGCG 
TCTGCACACG 
ACGGGAGGTG 
CATGCACGCG 
CTGCTGCGTG 
GTCGCTCCAG 
GGTCCTTCCA 
GGGCTCAAGG 
TTATTTATTA 
ACTGTGTATT 
AAAA 



11 
I 

GCAACCTGCA 
CTGGTGTTGC 
AGCCGCGCAA 
TTGCGGAAAC 
TCGAACACCG 
GGATCCGGCG 
GAGGCCTCCC 
GACGTGACAC 
CACCTGCGAC 
GCACGGCCCC 
CGTGCGCGCA 
GTCCGCGCGT 
CAAGTGACCA 
CAGATCAAGA 
CCCGCCAGCT 
ACCTATGATG 
CTGTGCACCT 
TTCCTGAGAC 
TTAATTTATT 
TATTTAAAAC 



21 
I 

CAGCCATGCC 
TGGTGCTCTC 
GTTTCCCGGG 
GCTACGAGGA 
ACCTCGTCCC 
GCCACCTGCA 
GCCTTCACCG 
GACCGCTGCG 
TGTCGCCGCC 
AGCTGGAGTT 
ACGGGGACGA 
CGCTGGAAGA 
TGTGCATCGG 
CGAGCCTGCA 
ACAATCCCAT 
ACTTGTTAGC 
GCGCGGGGGA 
ACCCGATTCC 
GGGGTGACCT 
TCTGGTGATA 



31 
I 

CGGGCAAGAA 
GTGGCTGCCG 
ACCCTCAGAG 
CCTGCTAACC 
GGCCCCTGCA 
CCTGCGTATC 
GGCTCTGTTC 
GCGTCAGCTC 
GCCGTCGCAG 
GCACTTGCGG 
CTGTCCGCTC 
CCTGGGCTGG 
OGCGTGCCCG 
CCGCCTGAAG 
GGTGCTCATT 
CAAAGACTGC 
GGCGACCTCA 
TGCCCAAACA 
TCTTGGGGAC 
AAAATAAAGC 



41 

I 

CTCAGGACGG 
CATGGGGGCG 
TTGCACTCCG 
AGGCTGCGGG 
GTCCGGATAC 
TCTCGGGCCG 
CGGCTGTCCC 
AGCCTTGCAA 
TCGGACCAAC 
CCGCAAGCCG 
GGGCCCGGGC 
GCCGATTGGG 
AGCCAGTTCC 
CCCGACACGG 
CAAAAGACCG 
CACTGCATAT 
GTTGTCCTGC 
GCTGTATTTA 
TCGGGGGCTG 
TGTCTGAACT 



51 
I 

TGAATGGCTC 
CCCTGTCTCT 
AAGACTCCAG 
CCAACCAGAG 
TCACGCCAGA 
CCCTTCCCGA 
CGACGGCGTC 
GACCCCAAGC 
TGCTGGCAGA 
CCAGGGGGCG 
GTTGCTGCCG 
TGCTGTCGCC 
GGGCGGCAAA 
AGCCAGCGCC 
ACACCGGGGT 
GAGCAGTCCT 
CCTGTGGAAT 
TATAAGTCTG 
GTCTGATGGA 
GTTAAAAAAA 



Seq ID NO: 685 Protein sequence 
Protein Accession #: NP_004855.1

1          11         21         31         41         51
|          |          |          |          |          |
MPGQELRTVN GSQMLLVLLV LSWLPHGGAL SIAEASRASF PGPSELHSED SRFRELRKRY 60
EDLLTRLRAN QSWEDSNTDL VPAPAVRILT PEVRLGSGGH LKLRISRAAL PEGLPEASRL 120
HRALFRLSPT ASRSWDVTRP LRRQLSLARP QAPALHLRLS PPPSQSDQLL AESSSARPQL 180
ELHLRPQAAR GRRRARARNG DDCPLGPGRC CRLHTVRASL EDLGWADWVL SPREVQVTMC 240
IGACPSQFRA ANMHAQIKTS LHRLKPDTEP APCCVPASYN PMVLIQKTDT GVSLQTYDDL 300
LAKDCHCI



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 



60 
120 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 



60 
120 
180 
240 
300 



Seq ID NO: 686 DNA sequence 
Nucleic Acid Accession #: NM_002423.2
Coding sequence: 48..851

1          11         21         31         41         51
|          |          |          |          |          |
ACCAAATCAA CCATAGGTCC AAGAACAATT GTCTCTGGAC GGCAGCTATG CGACTCACCG 60
TGCTGTGTGC TGTGTGCCTG CTGCCTGGCA GCCTGGCCCT GCCGCTGCCT CAGGAGGCGG 120
GAGGCATGAG TGAGCTACAG TGGGAACAGG CTCAGGACTA TCTCAAGAGA TTTTATCTCT 180
ATGACTCAGA AACAAAAAAT GCCAACAGTT TAGAAGCCAA ACTCAAGGAG ATGCAAAAAT 240
TCTTTGGCCT ACCTATAACT GGAATGTTAA ACTCCCGCGT CATAGAAATA ATGCAGAAGC 300
CCAGATGTGG AGTGCCAGAT GTTGCAGAAT ACTCACTATT TCCAAATAGC CCAAAATGGA 360
CTTCCAAAGT GGTCACCTAC AGGATCGTAT CATATACTCG AGACTTACCG CATATTACAG 420
TGGATCGATT AGTGTCAAAG GCTTTAAACA TGTGGGGCAA AGAGATCCCC CTGCATTTCA 480
GGAAAGTTGT ATGGGGAACT GCTGACATCA TGATTGGCTT TGCGCGAGGA GCTCATGGGG 540
ACTCCTACCC ATTTGATGGG CCAGGAAACA CGCTGGCTCA TGCCTTTGCG CCTGGGACAG 600
GTCTCGGAGG AGATGCTCAC TTCGATGAGG ATGAACGCTG GACGGATGGT AGCAGTCTAG 660
GGATTAACTT CCTGTATGCT GCAACTCATG AACTTGGCCA TTCTTTGGGT ATGGGACATT 720
CCTCTGATCC TAATGCAGTG ATGTATCCAA CCTATGGAAA TGGAGATCCC CAAAATTTTA 780
AACTTTCCCA GGATGATATT AAAGGCATTC AGAAACTATA TGGAAAGAGA AGTAATTCAA 840
GAAAGAAATA GAAACTTCAG GCAGAACATC CATTCATTCA TTCATTGGAT TGTATATCAT 900
TGTTGCACAA TCAGAATTGA TAAGCACTGT TCCTCCACTC CATTTAGCAA TTATGTCACC 960
CTTTTTTATT GCAGTTGGTT TTTGAATGTC TTTCACTCCT TTTATTGGTT AAACTCCTTT 1020
ATGGTGTGAC TGTGTCTTAT TCCATCTATG AGCTTTGTCA GTGCGCGTAG ATGTCAATAA 1080
ATGTTACATA CACAAATAAA TAAAATGTTT ATTCCATGGT AAATTTA




Seq ID NO: 687 Protein sequence 
Protein Accession #: NP_002414.1

1          11         21         31         41         51
|          |          |          |          |          |
MRLTVLCAVC LLPGSLALPL PQEAGGMSEL QWEQAQDYLK RFYLYDSETK NANSLEAKLK 60
EMQKFFGLPI TGMLNSRVIE IMQKPRCGVP DVAEYSLFPN SPKWTSKVVT YRIVSYTRDL 120
PHITVDRLVS KALNMWGKEI PLHFRKVVWG TADIMIGFAR GAHGDSYPFD GPGNTLAHAF 180
APGTGLGGDA HFDEDERWTD GSSLGINFLY AATHELGHSL GMGHSSDPNA VMYPTYGNGD 240
PQNFKLSQDD IKGIQKLYGK RSNSRKK
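
The coding-sequence coordinates given for Seq ID NO: 686 (48..851) can be related to the protein of Seq ID NO: 687 with a short script. The following is a minimal Python sketch and is not part of the original listing: the function names `translate_cds` and `codon_count` are hypothetical, the standard genetic code is assumed, and positions 41-50 of the first row of Seq ID NO: 686 are entered as GGCAGCTATG, which is an assumed reading of the printed text.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: the
# listing uses the standard code; "*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_count(start, end):
    """Number of complete codons in a 1-based, inclusive start..end range."""
    return (end - start + 1) // 3


def translate_cds(dna, start, end):
    """Translate the 1-based, inclusive region start..end of a DNA string.

    Spaces (the 10-base grouping used in the listing) are ignored.
    Translation stops at the first stop codon; unknown codons become "X".
    """
    cds = dna.replace(" ", "").upper()[start - 1:end]
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First row (positions 1-60) of Seq ID NO: 686, as read from the listing.
FIRST_ROW = "ACCAAATCAA CCATAGGTCC AAGAACAATT GTCTCTGGAC GGCAGCTATG CGACTCACCG"

print(translate_cds(FIRST_ROW, 48, 60))  # first four residues of the protein
print(codon_count(48, 851))              # codons in the annotated coding region
```

Under these assumptions the region 48..851 spans 804 nucleotides, i.e. 268 codons, which corresponds to 267 residues followed by a stop codon; the first codons translate to MRLT, matching the start of Seq ID NO: 687.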



Seq ID NO: 688 DNA sequence
Nucleic Acid Accession #: NM_005221.
Coding sequence: 1..870

1          11         21         31         41         51
|          |          |          |          |          |
ATGACAGGAG TGTTTGACAG AAGGGTCCCC AGCATCCGAT CCGGCGACTT CCAAGCTCCG 60
TTCCAGACGT CCGCAGCTAT GCACCATCCG TCTCAGGAAT CGCCAACTTT GCCCGAGTCT 120
TCAGCTACCG ATTCTGACTA CTACAGCCCT ACGGGGGGAG CCCCGCACGG CTACTGCTCT 180
CCTACCTCGG CTTCCTATGG CAAAGCTCTC AACCCCTACC AGTATCAGTA TCACGGCGTG 240
AACGGCTCCG CCGGGAGCTA CCCAGCCAAA GCTTATGCCG ACTATAGCTA CGCTAGCTCC 300
TACCACCAGT ACGGCGGCGC CTACAACCGC GTCCCAAGCG CCACCAACCA GCCAGAGAAA 360
GAAGTGACCG AGCCCGAGGT GAGAATGGTG AATGGCAAAC CAAAGAAAGT TCGTAAACCC 420
AGGACTATTT ATTCCAGCTT TCAGCTGGCC GCATTACAGA GAAGGTTTCA GAAGACTCAG 480
TACCTCGCCT TGCCGGAACG CGCCGAGCTG GCCGCCTCGC TGGGATTGAC ACAAACACAG 540
GTGAAAATCT GGTTTCAGAA CAAAAGATCC AAGATCAAGA AGATCATGAA AAACGGGGAG 600
ATGCCCCCGG AGCACAGTCC CAGCTCCAGC GACCCAATGG CGTGTAACTC GCCGCAGTCT 660
CCAGCGGTGT GGGAGCCCCA GGGCTCGTCC CGCTGGCTCA GCCACCACCC TCATGCCCAC 720
CCTCCGACCT CCAACCAGTC CCCAGCGTCC AGCTACCTGG AGAACTCTGC ATCCTGGTAC 780
ACAAGTGCAG CCAGCTCAAT CAATTCCCAC CTGCCGCCGC CGGGCTCCTT ACAGCACCCG 840
CTGGCGCTGG CCTCCGGGAC ACTCTATTAG

Seq ID NO: 689 Protein sequence
Protein Accession #: NP_005212.1

1          11         21         31         41         51
|          |          |          |          |          |
MTGVFDRRVP SIRSGDFQAP FQTSAAMHHP SQESPTLPES SATDSDYYSP TGGAPHGYCS 60
PTSASYGKAL NPYQYQYHGV NGSAGSYPAK AYADYSYASS YHQYGGAYNR VPSATNQPEK 120
EVTEPEVRMV NGKPKKVRKP RTIYSSFQLA ALQRRFQKTQ YLALPERAEL AASLGLTQTQ 180
VKIWFQNKRS KIKKIMKNGE MPPEHSPSSS DPMACNSPQS PAVWEPQGSS RSLSHHPHAH 240
PPTSNQSPAS SYLENSASWY TSAASSINSH LPPPGSLQHP LALASGTLY
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It is understood that the examples described above in no way serve to limit the true 
scope of this invention, but rather are presented for illustrative purposes. All publications, 
sequences of accession numbers, and patent applications cited in this specification are herein 
incorporated by reference as if each individual publication or patent application were 
specifically and individually indicated to be incorporated by reference. 
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WHAT IS CLAIMED IS: 

1. A method of detecting a lung cancer-associated transcript in a cell
from a patient, the method comprising contacting a biological sample from the
patient with a polynucleotide that selectively hybridizes to a sequence at
least 80% identical to a sequence as shown in Tables 1A-16.

2. The method of claim 1, wherein the polynucleotide selectively
hybridizes to a sequence at least 95% identical to a sequence as shown in
Tables 1A-16.

3. The method of claim 1, wherein the biological sample is a tissue
sample.

4. The method of claim 1, wherein the biological sample comprises
isolated nucleic acids.

5. The method of claim 4, wherein the nucleic acids are mRNA.

6. The method of claim 4, further comprising the step of amplifying
nucleic acids before the step of contacting the biological sample with the
polynucleotide.

7. The method of claim 1, wherein the polynucleotide comprises a
sequence as shown in Tables 1A-16.

8. The method of claim 1, wherein the polynucleotide is labeled.

9. The method of claim 8, wherein the label is a fluorescent label.

10. The method of claim 1, wherein the polynucleotide is immobilized on
a solid surface.

11. The method of claim 1, wherein the patient is undergoing a therapeutic
regimen to treat lung cancer.

12. The method of claim 1, wherein the patient is suspected of having lung
cancer.

13. A method of monitoring the efficacy of a therapeutic treatment of lung
cancer, the method comprising the steps of:
(i) providing a biological sample from a patient undergoing the therapeutic
treatment; and
(ii) determining the level of a lung cancer-associated transcript in the
biological sample by contacting the biological sample with a polynucleotide
that selectively hybridizes to a sequence at least 80% identical to a sequence
as shown in Tables 1A-16, thereby monitoring the efficacy of the therapy.

14. The method of claim 13, further comprising the step of: (iii) comparing
the level of the lung cancer-associated transcript to a level of the lung
cancer-associated transcript in a biological sample from the patient prior to,
or earlier in, the therapeutic treatment.

15. The method of claim 13, wherein the patient is a human.

16. A method of monitoring the efficacy of a therapeutic treatment of lung
cancer, the method comprising the steps of:
(i) providing a biological sample from a patient undergoing the therapeutic
treatment; and
(ii) determining the level of a lung cancer-associated antibody in the
biological sample by contacting the biological sample with a polypeptide
encoded by a polynucleotide that selectively hybridizes to a sequence at least
80% identical to a sequence as shown in Tables 1A-16, wherein the polypeptide
specifically binds to the lung cancer-associated antibody, thereby monitoring
the efficacy of the therapy.

17. The method of claim 16, further comprising the step of: (iii) comparing
the level of the lung cancer-associated antibody to a level of the lung
cancer-associated antibody in a biological sample from the patient prior to,
or earlier in, the therapeutic treatment.

18. The method of claim 16, wherein the patient is a human.

19. A method of monitoring the efficacy of a therapeutic treatment of lung
cancer, the method comprising the steps of:
(i) providing a biological sample from a patient undergoing the therapeutic
treatment; and
(ii) determining the level of a lung cancer-associated polypeptide in the
biological sample by contacting the biological sample with an antibody,
wherein the antibody specifically binds to a polypeptide encoded by a
polynucleotide that selectively hybridizes to a sequence at least 80%
identical to a sequence as shown in Tables 1A-16, thereby monitoring the
efficacy of the therapy.

20. The method of claim 19, further comprising the step of: (iii) comparing
the level of the lung cancer-associated polypeptide to a level of the lung
cancer-associated polypeptide in a biological sample from the patient prior
to, or earlier in, the therapeutic treatment.

21. The method of claim 19, wherein the patient is a human.

22. An isolated nucleic acid molecule consisting of a polynucleotide 
sequence as shown in Tables 1A-16. 

23. The nucleic acid molecule of claim 22, which is labeled. 

24. The nucleic acid of claim 23, wherein the label is a fluorescent label.

25. An expression vector comprising the nucleic acid of claim 22. 

26. A host cell comprising the expression vector of claim 25.

27. An isolated polypeptide which is encoded by a nucleic acid molecule 
having polynucleotide sequence as shown in Tables 1A-16. 

28. An antibody that specifically binds a polypeptide of claim 27. 

29. The antibody of claim 28, further conjugated to an effector component. 

30. The antibody of claim 29, wherein the effector component is a 
fluorescent label. 

31. The antibody of claim 29, wherein the effector component is a 
radioisotope or a cytotoxic chemical. 

32. The antibody of claim 29, which is an antibody fragment. 
1 33. The antibody of claim 29, which is a humanized antibody. 

1 34. A method of detecting a lung cancer cell in a biological sample from a 

2 patient, the method comprising contacting the biological sample with an antibody of claim 

3 28. 

1 35. The method of claim 34, wherein the antibody is further conjugated to 

2 an effector component. 

1 36. The method of claim 35, wherein the effector component is a 

2 fluorescent label. 

1 37. A method of detecting antibodies specific to lung cancer in a patient, 

2 the method comprising contacting a biological sample from the patient with a polypeptide 

3 encoded by a nucleic acid that comprises a sequence from Tables 1A-16. 

1 38. A method for identifying a compound that modulates a lung cancer- 

2 associated polypeptide, the method comprising the steps of: 

3 (i) contacting the compound with a lung cancer-associated polypeptide, the 

4 polypeptide encoded by a polynucleotide that selectively hybridizes to a sequence at least 

5 80% identical to a sequence as shown in Tables 1A-16; and 

6 (ii) determining the functional effect of the compound upon the polypeptide. 

1 39. The method of claim 38, wherein the functional effect is a physical 

2 effect. 

1 40. The method of claim 38, wherein the functional effect is a chemical 

2 effect. 

1 41. The method of claim 38, wherein the polypeptide is expressed in a 

2 eukaryotic host cell or cell membrane. 

1 42. The method of claim 38, wherein the functional effect is determined by 

2 measuring ligand binding to the polypeptide. 

1 43. The method of claim 38, wherein the polypeptide is recombinant. 
1 44. A method of inhibiting proliferation of a lung cancer-associated cell to 

2 treat lung cancer in a patient, the method comprising the step of administering to the subject a 

3 therapeutically effective amount of a compound identified using the method of claim 38. 

1 45. The method of claim 44, wherein the compound is an antibody. 

1 46. The method of claim 45, wherein the patient is a human. 

1 47. A drug screening assay comprising the steps of: 

2 (i) administering a test compound to a mammal having lung cancer or a cell 

3 isolated therefrom; 

4 (ii) comparing the level of gene expression of a polynucleotide that selectively 

5 hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1A-16 in a 

6 treated cell or mammal with the level of gene expression of the polynucleotide in a control 

7 cell or mammal, wherein a test compound that modulates the level of expression of the 

8 polynucleotide is a candidate for the treatment of lung cancer. 

1 48. The assay of claim 47, wherein the control is a mammal with lung 

2 cancer or a cell therefrom that has not been treated with the test compound. 

1 49. The assay of claim 47, wherein the control is a normal cell or mammal. 

1 50. A method for treating a mammal having lung cancer comprising 

2 administering a compound identified by the assay of claim 47. 

1 51. A pharmaceutical composition for treating a mammal having lung 

2 cancer, the composition comprising a compound identified by the assay of claim 47 and a 

3 physiologically acceptable excipient. 